A new derivative-based model for the automatic detection of time-reversed audio in the MPAI/IEEE-CAE ARP international standard
ZANINI, FABIO
2023/2024
Abstract
This work introduces a novel method, based on the derivative of the audio envelope, for detecting reversed sections in an audio document. The method was developed for incorporation into the MPAI/IEEE-CAE ARP standard. The Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) Context-based Audio Enhancement (CAE) Audio Recording Preservation (ARP) standard provides the technical specifications for a comprehensive framework for digitizing and preserving analog audio, focusing specifically on documents recorded on open-reel tapes. The primary objective of this project was to develop a method that automatically identifies segments of audio recorded in reverse, so that the algorithm can be used during the digitization of open-reel magnetic tapes. Using derivative-based signal processing algorithms, the system detects such reversed sections, thereby reducing errors during analog-to-digital (A/D) conversion. This aids in identifying and correcting digitization errors and improves the efficiency of large-scale audio document digitization projects. The system's performance was evaluated on a diverse dataset spanning various musical genres and digitized tapes, with results indicating that it is effective across different types of audio content.

File | Size | Format
---|---|---
Zanini_Fabio.pdf (open access) | 2.48 MB | Adobe PDF
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/75160