Web-based Continual Learning for Semantic Segmentation with Vision-Language Guidance

ZANE, FRANCESCO
2023/2024

Abstract

Previous work has shown that, within the framework of Weakly-Supervised Incremental Learning for Semantic Segmentation (WILSS), web images can be used effectively both to recover examples of previously seen classes and to learn new classes, achieving state-of-the-art performance. In particular, we propose an architecture that relies on vision-language models to extract captions from images; these captions then serve as queries for searching the web for images of both old and new classes. This thesis aims to develop new techniques for filtering the retrieved web images so that they are consistent with the search queries.
Keywords: Computer vision, NLP, Image segmentation, Continual learning, Deep learning
Files in this item:
Zane_Francesco.pdf (open access, 6.45 MB, Adobe PDF)

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/74202