Fine-tuning of pre-trained ASR models for transcription of Italian fiscal codes

BORASO, FRANCESCO

2024/2025

Abstract

This thesis presents an automatic speech recognition (ASR) system fine-tuned for the accurate extraction of Italian tax codes (codici fiscali) from spoken input. The work focuses on adapting general-purpose ASR models to domain-specific tasks, where precision in recognizing structured alphanumeric sequences is essential. The system is built on Whisper, a multilingual model developed by OpenAI, and has been adapted to improve accuracy in detecting tax codes pronounced in natural speech. A dedicated validation and error-checking mechanism ensures that only syntactically and logically valid codes are accepted, reducing the impact of minor transcription errors and improving robustness in real-world scenarios. This project demonstrates the effectiveness of fine-tuned ASR systems in specialized contexts and shows promise for applications in administrative, legal, or customer-service domains where accurate extraction of formal identifiers from speech is required.
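The record does not describe how the thesis implements its validation step, so the following is only an illustrative sketch of what a syntactic and logical check on a transcribed codice fiscale could look like. It relies on the publicly documented 16-character layout and check-character rule; the names `check_char` and `is_valid` are hypothetical, and codes altered for omocodia (digits replaced by letters) are deliberately not handled.

```python
import re

# Standard 16-character layout: 6 letters, 2-digit year, month letter,
# 2-digit day, municipality code (letter + 3 digits), check letter.
# Assumption: omocodia variants are out of scope for this sketch.
_PATTERN = re.compile(r"[A-Z]{6}[0-9]{2}[ABCDEHLMPRST][0-9]{2}[A-Z][0-9]{3}[A-Z]")

# Values for characters in odd positions (1-indexed), indexed by 0-9 or A-Z.
_ODD = [1, 0, 5, 7, 9, 13, 15, 17, 19, 21, 2, 4, 18,
        20, 11, 3, 6, 8, 12, 14, 16, 10, 22, 25, 24, 23]


def _index(ch: str) -> int:
    """Map '0'-'9' to 0-9 and 'A'-'Z' to 0-25."""
    return int(ch) if ch.isdigit() else ord(ch) - ord("A")


def check_char(first15: str) -> str:
    """Compute the check letter from the first 15 characters."""
    total = 0
    for i, ch in enumerate(first15):
        idx = _index(ch)
        # i is 0-based, so even i corresponds to an odd 1-indexed position.
        total += _ODD[idx] if i % 2 == 0 else idx
    return chr(ord("A") + total % 26)


def is_valid(code: str) -> bool:
    """Return True if the code matches the layout, day range, and check letter."""
    code = re.sub(r"[\s-]", "", code).upper()  # tolerate spacing from a transcript
    if not _PATTERN.fullmatch(code):
        return False
    day = int(code[9:11])  # women's codes store the day plus 40
    if not (1 <= day <= 31 or 41 <= day <= 71):
        return False
    return check_char(code[:15]) == code[15]
```

A check of this kind would let a transcription pipeline reject a hypothesis in which a single character was misrecognized, since most single-character substitutions change the expected check letter.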

	Faculty/Department
	
				Dipartimento di Matematica "Tullio Levi-Civita" - DM
			
	Degree programme
	
				Computer Science, Master's degree (Laurea Magistrale, D.M. 270/2004)
			
	Academic year
	
				2024
			
	English title
	
				Fine-tuning of pre-trained ASR models for transcription of Italian fiscal codes
			
	Keywords
	
				Fine-Tuning
				Pre-Trained Models
				ASR
				Transcription
				Italian Fiscal Codes
			
	Supervisor
	
				SUSTO, GIAN ANTONIO
			
	Appears in collections:
	
				Master's degree theses

Files in this item:

File	Size	Format
Master_Thesis.pdf (restricted access)	878.49 kB	Adobe PDF

The text of this website is © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/91850