AI driven generation and classification of short sound messages for Internet of Audio Things
FAVERO, MANUELE
2023/2024
Abstract
This thesis explores the automated generation and classification of short sound messages, referred to as TIScode, for the Internet of Audio Things (IoAuT). A TIScode is a brief audio sequence, lasting 4-5 seconds, that carries digital information and can be recognized by a dedicated smartphone application. The work develops methodologies for each stage of the TIScode pipeline: generation, transmission, and finally reception and decoding. For the generation phase, MusicGen, a state-of-the-art autoregressive transformer model, is proposed, along with an analysis of its degrees of freedom aimed at maximizing the variety of distinct audio tracks that can be generated. Additionally, a channel coding system is introduced, based on the quantization of sound features and of certain high-level features extracted through convolutional neural networks (CNNs). These features are mapped to create a unique bitmap for each TIScode, simplifying the decoding process. For the recognition phase, an algorithm is presented that combines sound feature analysis with frequency-based peak analysis to enhance detection accuracy. Experimental results, obtained through simulations and field tests, demonstrate the effectiveness of the system in retrieving the digital information encoded within sound messages.

File | Size | Format |
---|---|---|
Favero_Manuele.pdf (open access) | 7.01 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/73442