Multimodal Domain Adaptation for Point Cloud Semantic Segmentation
Nicoletti, Gianpietro
2022/2023
Abstract
Semantic segmentation can be made more reliable and accurate by exploiting multimodal datasets, i.e. by combining heterogeneous data such as RGB images with LIDAR sequences or depth maps. However, training neural networks to solve this task effectively requires a large amount of annotated data. Obtaining accurate and complete annotations for every image and LIDAR sample in a dataset (i.e. a label for each pixel or 3D point) can be a significant challenge, particularly for complex problems. To address the lack of annotations, Unsupervised Domain Adaptation (UDA) has been studied and developed in recent years: it assumes that labels are not available for the target dataset and supervises the training of the model using the labels of the source dataset. Using multimodal data in the UDA setting is challenging, since the different modalities can be affected differently by the domain shift. In this work, autonomous driving datasets are explored to develop UDA techniques, with a specific emphasis on adapting between RGB-Depth (RGB-D) and RGB-LIDAR data. For all the datasets considered, the information from both 2D images and 3D data (RGB-D or RGB-LIDAR) was used. The results demonstrate the effectiveness of adapting between different datasets and highlight how the accuracy on 3D data can be enhanced by leveraging the information provided by the images.
File | Size | Format
---|---|---
Nicoletti_Gianpietro.pdf (Open Access from 26/10/2024) | 10.57 MB | Adobe PDF
https://hdl.handle.net/20.500.12608/56237