In the past few years, there has been a growing focus on semantic segmentation, which involves assigning each pixel in an image to a specific label from a given set[65].The use of autoencoder architectures has been explored by numerous computer vision researchers in an attempt to develop models capable of learning both the semantics of an image and a low-level representation of it. When utilizing an autoencoder architecture, the input undergoes encoding to produce a low-dimensional representation. This representation is subsequently leveraged by a decoder to reconstruct the original data. The presented ap- proach involves a combination of convolutional neural networks (CNNs) and transformers to form an ensemble, as detailed in this work. Ensemble methods rely on multiple models being trained and utilized for classification, with the ensemble combining the outputs of individual classifiers. By capitalizing on the varying strengths of each classifier, this approach enhances the overall performance of the system. Distinct loss functions are employed to ensure diversity among the individual networks. The ensemble method em- ploys a combination of the DeepLabV3+, HarDNet, and PVT environments, with varying backbone networks. Additionally, a novel loss function is presented, which integrates the Dice and Structural Similarity Index. To assess the proposed ensemble, a comprehensive empirical evaluation is conducted on six real-world scenarios, namely polyp, skin segmen- tation, leukocyte segmentation, butterfly identification, microorganism identification, and radiology segmentation. The proposed model has achieved state-of-the-art performance on these scenarios.

In the past few years, there has been a growing focus on semantic segmentation, which involves assigning each pixel in an image to a specific label from a given set[65].The use of autoencoder architectures has been explored by numerous computer vision researchers in an attempt to develop models capable of learning both the semantics of an image and a low-level representation of it. When utilizing an autoencoder architecture, the input undergoes encoding to produce a low-dimensional representation. This representation is subsequently leveraged by a decoder to reconstruct the original data. The presented ap- proach involves a combination of convolutional neural networks (CNNs) and transformers to form an ensemble, as detailed in this work. Ensemble methods rely on multiple models being trained and utilized for classification, with the ensemble combining the outputs of individual classifiers. By capitalizing on the varying strengths of each classifier, this approach enhances the overall performance of the system. Distinct loss functions are employed to ensure diversity among the individual networks. The ensemble method em- ploys a combination of the DeepLabV3+, HarDNet, and PVT environments, with varying backbone networks. Additionally, a novel loss function is presented, which integrates the Dice and Structural Similarity Index. To assess the proposed ensemble, a comprehensive empirical evaluation is conducted on six real-world scenarios, namely polyp, skin segmen- tation, leukocyte segmentation, butterfly identification, microorganism identification, and radiology segmentation. The proposed model has achieved state-of-the-art performance on these scenarios.

An Empirical Study on Segmentation Methods with Deep Ensembles and Data Augmentation

CUZA, DANIELA
2022/2023

Abstract

In the past few years, there has been a growing focus on semantic segmentation, which involves assigning each pixel in an image to a specific label from a given set[65].The use of autoencoder architectures has been explored by numerous computer vision researchers in an attempt to develop models capable of learning both the semantics of an image and a low-level representation of it. When utilizing an autoencoder architecture, the input undergoes encoding to produce a low-dimensional representation. This representation is subsequently leveraged by a decoder to reconstruct the original data. The presented ap- proach involves a combination of convolutional neural networks (CNNs) and transformers to form an ensemble, as detailed in this work. Ensemble methods rely on multiple models being trained and utilized for classification, with the ensemble combining the outputs of individual classifiers. By capitalizing on the varying strengths of each classifier, this approach enhances the overall performance of the system. Distinct loss functions are employed to ensure diversity among the individual networks. The ensemble method em- ploys a combination of the DeepLabV3+, HarDNet, and PVT environments, with varying backbone networks. Additionally, a novel loss function is presented, which integrates the Dice and Structural Similarity Index. To assess the proposed ensemble, a comprehensive empirical evaluation is conducted on six real-world scenarios, namely polyp, skin segmen- tation, leukocyte segmentation, butterfly identification, microorganism identification, and radiology segmentation. The proposed model has achieved state-of-the-art performance on these scenarios.
2022
An Empirical Study on Segmentation Methods with Deep Ensembles and Data Augmentation
In the past few years, there has been a growing focus on semantic segmentation, which involves assigning each pixel in an image to a specific label from a given set[65].The use of autoencoder architectures has been explored by numerous computer vision researchers in an attempt to develop models capable of learning both the semantics of an image and a low-level representation of it. When utilizing an autoencoder architecture, the input undergoes encoding to produce a low-dimensional representation. This representation is subsequently leveraged by a decoder to reconstruct the original data. The presented ap- proach involves a combination of convolutional neural networks (CNNs) and transformers to form an ensemble, as detailed in this work. Ensemble methods rely on multiple models being trained and utilized for classification, with the ensemble combining the outputs of individual classifiers. By capitalizing on the varying strengths of each classifier, this approach enhances the overall performance of the system. Distinct loss functions are employed to ensure diversity among the individual networks. The ensemble method em- ploys a combination of the DeepLabV3+, HarDNet, and PVT environments, with varying backbone networks. Additionally, a novel loss function is presented, which integrates the Dice and Structural Similarity Index. To assess the proposed ensemble, a comprehensive empirical evaluation is conducted on six real-world scenarios, namely polyp, skin segmen- tation, leukocyte segmentation, butterfly identification, microorganism identification, and radiology segmentation. The proposed model has achieved state-of-the-art performance on these scenarios.
segmentation
ensembles
deep learning
data augmentation
loss function
File in questo prodotto:
File Dimensione Formato  
Cuza_Daniela.pdf

accesso aperto

Dimensione 4.65 MB
Formato Adobe PDF
4.65 MB Adobe PDF Visualizza/Apri

The text of this website © Università degli studi di Padova. Full Text are published under a non-exclusive license. Metadata are under a CC0 License

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.12608/50909