Deep learning-based loudspeaker compensation system using gradient-free stochastic optimization

TRABUCCO, GIOVANNI
2022/2023

Abstract

In loudspeaker design and manufacturing, impulse response pre-compensation is a widely studied topic. Standard methods based on DSP techniques achieve satisfactory results, but they require a tractable mathematical representation of the system at hand, often obtained through system identification-like procedures. This thesis project investigates whether the same task can be approached from a different perspective, i.e. by employing neural networks, under the assumption that no analytical model of the loudspeaker is known. Neural architectures may be good candidates for the problem given their strong representational power, which makes them especially suitable for highly non-linear systems. In deep learning, training an end-to-end system typically requires the derivatives of all the functions involved, which are not available here for the loudspeaker. This work explores two methods for performing impulse response pre-compensation with neural networks without access to those gradients: the first efficiently computes an unbiased (in the limit) estimator of the gradient of the loudspeaker model using the Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm, while the second employs a second network to learn a differentiable approximation of the speaker function and uses that approximation to train the target pre-compensating network. Both approaches are compared to the baseline, and the quality of the results is assessed through objective metrics measured directly on the output waveforms.
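The SPSA idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the step size `c`, the sample count, and the quadratic `toy_speaker_loss` (a hypothetical stand-in for a loss measured through a black-box loudspeaker) are all assumptions made for illustration. The sketch shows the standard two-sided SPSA estimate, which needs only two evaluations of the black-box function regardless of the input dimension.

```python
import random


def spsa_gradient(f, x, c=1e-3, rng=random):
    """Two-sided SPSA estimate of the gradient of a black-box scalar f at x."""
    # Rademacher perturbation: each component is +1 or -1 with equal probability.
    delta = [rng.choice((-1.0, 1.0)) for _ in x]
    x_plus = [xi + c * di for xi, di in zip(x, delta)]
    x_minus = [xi - c * di for xi, di in zip(x, delta)]
    # Only two evaluations of f, independent of len(x).
    diff = f(x_plus) - f(x_minus)
    return [diff / (2.0 * c * di) for di in delta]


def averaged_spsa_gradient(f, x, c=1e-3, n_samples=4000, seed=0):
    """Average many independent SPSA estimates to reduce their variance."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        g = spsa_gradient(f, x, c, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n_samples for t in total]


def toy_speaker_loss(x):
    # Hypothetical stand-in for a loss computed on a black-box system's output.
    return sum((xi - 1.0) ** 2 for xi in x)


if __name__ == "__main__":
    x0 = [0.0, 2.0, 3.0]
    # The exact gradient of the toy loss at x0 is [-2.0, 2.0, 4.0].
    print(averaged_spsa_gradient(toy_speaker_loss, x0))
```

A single SPSA estimate is noisy, since every component shares the same finite difference and differs only in the sign of its perturbation. In a training loop the noise is usually averaged out over many optimization steps rather than by explicit averaging as done here.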
Keywords: machine learning, neural networks, DSP
Files in this item:
Trabucco_Giovanni.pdf (restricted access), 8.87 MB, Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/46081