Web-based Continual Learning for Semantic Segmentation with Vision-Language Guidance
ZANE, FRANCESCO
2023/2024
Abstract
Previous work has shown that using web images within the framework of current Weakly-Supervised Incremental Learning for Semantic Segmentation (WILSS) makes it possible to effectively retrieve images of previously seen classes and learn new classes, achieving state-of-the-art performance. In particular, we propose an architecture that relies on vision-language models to extract captions from images, which are then used to search the web for images of both old and new classes. This thesis aims to develop new techniques for filtering the retrieved web images so that they are consistent with the search queries.

File | Size | Format |
---|---|---|
Zane_Francesco.pdf (open access) | 6.45 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/74202