AI-driven generation and classification of short sound messages for the Internet of Audio Things

FAVERO, MANUELE
2023/2024

Abstract

This thesis explores the automated generation and classification of short sound messages, referred to as TIScodes, for the Internet of Audio Things (IoAuT). A TIScode is a brief audio sequence, lasting 4 to 5 seconds, that carries digital information recognizable by a dedicated smartphone application. The work develops methodologies for each stage of the TIScode pipeline: generation, transmission, and, finally, reception and decoding. For the generation phase, MusicGen, a state-of-the-art autoregressive transformer model, is proposed, together with an analysis of its degrees of freedom aimed at maximizing the variety of distinct audio tracks it can generate. In addition, a channel coding scheme is introduced, based on the quantization of sound features together with high-level features extracted by convolutional neural networks (CNNs). These features are mapped to a unique bitmap for each TIScode, simplifying the decoding process. For the recognition phase, an algorithm is presented that combines sound-feature analysis with frequency-based peak analysis to improve detection accuracy. Experimental results from simulations and field tests demonstrate the effectiveness of the system in retrieving the digital information encoded in the sound messages.
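The recognition algorithm described in the abstract combines sound-feature analysis with frequency-based peak analysis. As an illustration of the peak-analysis half only (a minimal sketch, not the implementation from the thesis), the dominant spectral peaks of a received audio frame can be located from a plain FFT magnitude spectrum; the function name and parameters below are hypothetical:

```python
import numpy as np

def spectral_peaks(signal, sample_rate, n_peaks=5):
    """Return the n_peaks most prominent frequencies in a signal.

    Illustrative sketch: compute the real-FFT magnitude spectrum and
    pick the strongest bins. A real decoder would add windowing,
    noise floor estimation, and peak-prominence checks.
    """
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # Indices of the strongest magnitude bins, largest first.
    top = np.argsort(spectrum)[::-1][:n_peaks]
    return sorted(float(f) for f in freqs[top])

# Synthetic example: one second of two tones at 440 Hz and 880 Hz.
sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
peaks = spectral_peaks(x, sr, n_peaks=2)  # ≈ [440.0, 880.0]
```

With a one-second frame the FFT bin width is 1 Hz, so the two tones land on exact bins; in a field recording the peaks would spread over neighboring bins and need grouping.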
InternetOfAudioThing
Signal Processing
Generative Models
Classification Model
InformationSound
Files in this item:

File: Favero_Manuele.pdf
Access: open access
Size: 7.01 MB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/73442