Deep learning techniques for biological signal processing: Automatic detection of dolphin sounds

KORKMAZ, BURLA NUR
2021/2022

Abstract

Given the heterogeneous nature of underwater acoustic transmission, detecting and classifying cetacean vocalizations remains a challenging problem of growing interest. A promising avenue for improving current detection systems is machine learning. In particular, Convolutional Neural Networks (CNNs) are among the most promising deep learning techniques, since they have already excelled in problems involving the automatic processing of biological sounds. Human-annotated spectrograms can be used to train CNNs to discriminate patterns in the time-frequency domain, enabling the detection and classification of marine mammal sounds. However, despite these promising capabilities, machine learning suffers from a lack of labeled data, which calls for the adoption of transfer learning to create accurate models even when the availability of human annotators is limited. In this thesis, we developed a dolphin whistle detection framework based on deep learning models. In particular, we investigated the performance of a large-scale pre-trained model (VGG16) and compared it with a vanilla Convolutional Neural Network and several baselines (logistic regression and Support Vector Machines). The pre-trained VGG16 model achieved the best detection performance, with an accuracy of 98.9% on a held-out test dataset.
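
The abstract describes a pipeline in which audio is first converted to spectrograms for CNN input. As a minimal sketch of that preprocessing step (the file name, sample rate, and STFT parameters below are illustrative assumptions, not values taken from the thesis):

    # Minimal sketch: turn an audio clip into a log-magnitude spectrogram
    # suitable as CNN input. All parameters are illustrative assumptions.
    import numpy as np
    import librosa

    def audio_to_spectrogram(path, sr=96000, n_fft=1024, hop_length=512):
        """Load a clip and return its log-magnitude spectrogram in dB."""
        y, _ = librosa.load(path, sr=sr)
        stft = librosa.stft(y, n_fft=n_fft, hop_length=hop_length)
        return librosa.amplitude_to_db(np.abs(stft), ref=np.max)

    spec = audio_to_spectrogram("dolphin_clip.wav")  # hypothetical file name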
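For the transfer-learning model the abstract names (VGG16), a common setup is to reuse ImageNet-pre-trained convolutional features and train a new binary head for whistle presence. The head architecture and input size below are assumptions; the thesis' exact configuration may differ:

    # Hedged sketch of transfer learning with a pre-trained VGG16 backbone.
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG16

    base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # freeze pre-trained features; optionally fine-tune later

    model = models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(256, activation="relu"),   # head size is an assumption
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),  # whistle present / absent
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

Freezing the backbone is what makes this viable with little labeled data: only the small head is trained from scratch, while the convolutional filters reuse what VGG16 already learned.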
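The baselines mentioned (logistic regression and Support Vector Machines) can be run on flattened spectrograms. The placeholder data below stands in for real features, since the thesis' feature extraction is not specified here:

    # Hypothetical baseline comparison on flattened spectrograms.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 64 * 64))   # placeholder flattened spectrograms
    y = rng.integers(0, 2, size=200)      # placeholder whistle / no-whistle labels

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)
    for clf in (LogisticRegression(max_iter=1000), SVC(kernel="rbf")):
        clf.fit(X_train, y_train)
        print(type(clf).__name__, "accuracy:", clf.score(X_test, y_test))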
Keywords

CNN, Underwater SP, Transfer learning, Remote sensing, Dolphin sounds
File in this record: Korkmaz_Burla_Nur.pdf (Adobe PDF, 2.8 MB, open access)


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/31586