DiffuWaste: Data Augmentation using Diffusion Model for Waste Semantic Segmentation
SAMBIN, LUCA
2023/2024
Abstract
Recent progress in text-to-image and image-to-image generation models has produced sophisticated systems capable of generating highly realistic and detailed images. These models have also become more accessible, enabling researchers to explore new techniques for finer manipulation and control. This thesis investigates the potential of Latent Diffusion Models (LDMs), specifically an exemplar-driven generation pipeline, to address data scarcity in semantic segmentation tasks within the waste sorting domain. It focuses on the results obtained by training the semantic segmentation model on synthetically generated images. My findings indicate that exemplar-driven augmentation is a competitive technique, particularly when little training data is available. However, a significant gap still exists between synthetic and real images. These results emphasize the potential of zero-shot learning approaches, which offer ways to reduce or eliminate the cost of creating extensive datasets while enhancing the capabilities of computer vision models. The code used in this thesis is available at https://github.com/lucasambin/DiffuWaste

File | Size | Format |
---|---|---|
Sambin_Luca.pdf (embargoed until 06/09/2025) | 3.86 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/62375