Single-Channel Speaker Distance Estimation using the Short-Term Power of the Autocorrelation

Sound distance estimation is a key, yet comparatively underexplored, component of sound source localization within sound event localization and detection systems, particularly in single-channel (monaural) scenarios where spatial cues are limited. This work investigates the integration of a reverberation-oriented feature, the short-term power of the autocorrelation coefficients, into an existing convolutional recurrent neural network model for monaural speaker distance estimation. The proposed approach replaces phase-based features with the new feature while retaining the magnitude of the short-time Fourier transform. Experiments are conducted on a synthetic dataset, with added real background noise. Results show that, although phase-based features achieve higher accuracy in noiseless conditions, the reverberation-oriented feature provides more stable performance across varying noise levels.
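The abstract names the feature but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: frame the signal, compute each frame's normalized autocorrelation, and take the mean squared value of the coefficients over a range of lags. The function names, frame length, hop size, and lag range are all hypothetical choices, not values taken from the thesis.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (no padding at the end)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def autocorr_power(x, frame_len=512, hop=256, max_lag=128):
    """Per-frame mean squared normalized autocorrelation over lags 1..max_lag.

    Assumed definition for illustration; the thesis may define the feature differently.
    """
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    # Autocorrelation via the power spectrum; zero-padding to 2 * frame_len
    # avoids circular wrap-around.
    spec = np.fft.rfft(frames, n=2 * frame_len, axis=1)
    acf = np.fft.irfft(np.abs(spec) ** 2, axis=1)[:, : max_lag + 1]
    # Normalize by the lag-0 value so the result does not depend on signal level.
    acf = acf / np.maximum(acf[:, :1], 1e-12)
    return np.mean(acf[:, 1:] ** 2, axis=1)

# Usage: one feature value per frame for a synthetic test signal.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
feature = autocorr_power(signal)
```

Under this reading, a signal with strong delayed copies of itself (as reverberation produces) tends to keep more energy at non-zero lags than a dry, noise-like signal. How the thesis shapes this feature and combines it with the STFT magnitude is not stated in the abstract.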

TOMADA, RICCARDO

2025/2026

Faculty/Department: Dipartimento di Ingegneria dell'Informazione - DEI
Degree programme: Ingegneria dell'Informazione (Information Engineering), first-cycle degree (D.M. 270/2004)
Academic year: 2025
Keywords: Autocorrelation; SELD; Distance Estimation
Supervisor: BATTISTI, FEDERICA
Appears in collections: Lauree triennali (bachelor's theses)

Files in this item:

File	Size	Format
Tomada_Riccardo.pdf (open access)	489.09 kB	Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/104356