Fine-tuning of pre-trained ASR models for transcription of Italian fiscal codes

BORASO, FRANCESCO
2024/2025

Abstract

This thesis presents an automatic speech recognition (ASR) system fine-tuned for the accurate extraction of Italian tax codes (codici fiscali) from spoken input. The work focuses on adapting general-purpose ASR models to domain-specific tasks, where precision in recognizing structured alphanumeric sequences is essential. The system is built on Whisper, a multilingual model developed by OpenAI, and has been adapted to improve accuracy in detecting tax codes pronounced in natural speech. A dedicated validation and error-checking mechanism ensures that only syntactically and logically valid codes are accepted, reducing the impact of minor transcription errors and improving robustness in real-world scenarios. This project demonstrates the effectiveness of fine-tuned ASR systems in specialized contexts and shows promise for applications in administrative, legal, or customer-service domains where accurate extraction of formal identifiers from speech is required.
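As a rough illustration of what a syntactic and logical check on a transcribed code can look like, the sketch below tests the 16-character layout and recomputes the check character using the publicly documented codice fiscale rules. It is an assumption-laden example, not the validation mechanism implemented in the thesis: the function names and the regular expression are hypothetical, the day-of-birth field is not range-checked, and omocodia variants (digits replaced by letters) are deliberately rejected.

```python
import re

# Values assigned to characters in odd positions (1-indexed), following the
# publicly documented check-character rules for the codice fiscale.
ODD_VALUES = dict(zip(
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    [1, 0, 5, 7, 9, 13, 15, 17, 19, 21,
     1, 0, 5, 7, 9, 13, 15, 17, 19, 21,
     2, 4, 18, 20, 11, 3, 6, 8, 12, 14, 16, 10, 22, 25, 24, 23],
))

# Assumed layout (no omocodia): 6 letters, 2-digit year, month letter,
# 2-digit day, 1 letter + 3 digits for the place code, 1 check letter.
CF_PATTERN = re.compile(
    r"[A-Z]{6}[0-9]{2}[ABCDEHLMPRST][0-9]{2}[A-Z][0-9]{3}[A-Z]"
)


def even_value(ch: str) -> int:
    """Value of a character in an even position: digit value, or A=0..Z=25."""
    return int(ch) if ch.isdigit() else ord(ch) - ord("A")


def expected_check_char(first15: str) -> str:
    """Compute the check character from the first 15 characters."""
    total = 0
    for index, ch in enumerate(first15):
        # index 0 corresponds to position 1, which is an odd position
        total += ODD_VALUES[ch] if index % 2 == 0 else even_value(ch)
    return chr(ord("A") + total % 26)


def is_valid_codice_fiscale(text: str) -> bool:
    """Return True if the text has the expected layout and check character."""
    code = re.sub(r"\s+", "", text).upper()
    if not CF_PATTERN.fullmatch(code):
        return False
    return expected_check_char(code[:15]) == code[15]
```

A check of this kind could be applied to each ASR hypothesis so that a transcription with a single wrong character is rejected rather than accepted as a valid identifier.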
2024
Fine Tuning
Pre Trained Models
ASR
Transcription
Italian Fiscal Codes
Files in this item:
File: Master_Thesis.pdf (restricted access)
Size: 878.49 kB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/91850