Study on Vision Transformer Ensemble for medical image Segmentation

MANFE', ALESSANDRO

2024/2025

Abstract

Deep learning has emerged as a powerful tool for medical image segmentation, which is crucial for accurate diagnosis and treatment planning. This thesis explores ensembles of Vision Transformers (ViTs) alongside models from standard approaches such as Convolutional Neural Networks (CNNs). We investigate the effect of combining various networks on the challenging task of medical image segmentation, working in particular with polyp image datasets. The study also examines whether varying the loss function during training is effective for the strategies used. In addition to evaluating the impact of standard data augmentation, we adopt a recent and promising approach based on the Segment Anything Model (SAM) by Meta. SAM is used as a pre-processing step that adds informative cues to the input, exploiting the model's strong general-purpose segmentation ability. Through extensive experimentation, and with the appropriate caveats, our approach achieves performance comparable to most of the latest methods and offers insights for more detailed studies of medical image segmentation.
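
The abstract does not state how the networks' outputs are combined or how results are scored. As a minimal sketch only, assuming a simple average of per-pixel foreground probabilities followed by a 0.5 threshold, and assuming the Dice coefficient as the overlap measure, the idea of an ensemble of segmentation networks can be illustrated as follows. The function names `average_fusion` and `dice_score` are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence

ProbMap = List[List[float]]   # per-pixel foreground probabilities in [0, 1]
BinaryMask = List[List[int]]  # per-pixel labels, 0 = background, 1 = polyp


def average_fusion(prob_maps: Sequence[ProbMap], threshold: float = 0.5) -> BinaryMask:
    """Average the probability maps of several models, then threshold.

    Assumption: plain averaging is used here for illustration; the thesis
    may use a different fusion rule.
    """
    if not prob_maps:
        raise ValueError("need at least one probability map")
    n_models = len(prob_maps)
    height, width = len(prob_maps[0]), len(prob_maps[0][0])
    fused: BinaryMask = []
    for r in range(height):
        row = []
        for c in range(width):
            mean_prob = sum(m[r][c] for m in prob_maps) / n_models
            row.append(1 if mean_prob >= threshold else 0)
        fused.append(row)
    return fused


def dice_score(pred: BinaryMask, target: BinaryMask) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|); defined as 1.0 when both masks are empty."""
    intersection = sum(p * t for p_row, t_row in zip(pred, target)
                       for p, t in zip(p_row, t_row))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x2 example with made-up probabilities from two hypothetical models.
vit_probs = [[0.9, 0.2], [0.6, 0.1]]
cnn_probs = [[0.7, 0.6], [0.2, 0.3]]
ground_truth = [[1, 0], [1, 0]]

ensemble_mask = average_fusion([vit_probs, cnn_probs])
score = dice_score(ensemble_mask, ground_truth)
```

Averaging probabilities before thresholding lets a confident model outweigh an uncertain one at each pixel, which is one common reason to fuse soft outputs rather than binary masks.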

Faculty/Department: Dipartimento di Ingegneria dell'Informazione - DEI (Department of Information Engineering)
Degree programme: Computer Engineering, Laurea Magistrale (master's degree, D.M. 270/2004)
Academic year: 2024
English title: Study on Vision Transformer Ensemble for medical image Segmentation
Keywords: Segmentation; Vision Transformer; Medical Image
Supervisor: NANNI, LORIS
Appears in categories: Master's theses

Files in this item:

File	Size	Format
Manfè_Alessandro.pdf (restricted access)	12.89 MB	Adobe PDF

The text of this website is © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/82088