Domain Adaptation Across Aerial Viewpoints in Semantic Segmentation of Drone Images
CANADA CARRIL, JUAN
2024/2025
Abstract
Domain adaptation is a critical challenge in computer vision, even more so in dense prediction tasks such as semantic segmentation. This work explores two distinct approaches to adapting semantic segmentation models across viewpoints in a synthetic drone video dataset. The first approach simulates perspective changes with homographies: leveraging the intrinsic and extrinsic camera parameters, 2D transformations warp the RGB images and the semantic segmentation ground truth between viewpoints. However, this method introduces significant distortions in 3D structures such as buildings and trees, limiting its effectiveness. To overcome these limitations, a second approach incorporates depth information captured by the camera. This multimodal technique uses depth maps to perform 3D-aware warping, allowing more accurate adaptation between viewpoints. It improves segmentation performance, though the degree of improvement depends on the severity of the viewpoint change. Preliminary results indicate that while homographies provide a computationally efficient solution, they fall short in adapting to 3D scene changes. Incorporating depth information offers a more robust approach, yielding better segmentation results. This study highlights the importance of leveraging 3D information for effective domain adaptation in aerial semantic segmentation and provides insights into the trade-offs between the two techniques.
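The two warping strategies described above can be sketched as follows. This is a minimal illustration rather than the thesis implementation: it assumes pinhole cameras with known intrinsics (`K_src`, `K_tgt`) and a relative pose (`R`, `t`) mapping source-camera coordinates to target-camera coordinates; the function names, the default plane parameters for the homography, and the z-buffered splatting are illustrative choices, not details taken from the work.

```python
# Sketch of planar-homography warping vs. depth-based 3D-aware warping
# between two calibrated viewpoints. All names are illustrative.
import cv2
import numpy as np

def homography_warp(img, K_src, K_tgt, R, t,
                    n=np.array([0.0, 0.0, 1.0]), d=1.0, is_label=False):
    """Warp via the plane-induced homography H = K_tgt (R - t n^T / d) K_src^-1.

    n and d (plane normal and distance in the source frame) are placeholders;
    the warp is only exact for points on that plane, which is why elevated 3D
    structures such as buildings and trees get distorted.
    """
    H = K_tgt @ (R - np.outer(t, n) / d) @ np.linalg.inv(K_src)
    h, w = img.shape[:2]
    interp = cv2.INTER_NEAREST if is_label else cv2.INTER_LINEAR
    return cv2.warpPerspective(img, H, (w, h), flags=interp)

def depth_warp(img, depth, K_src, K_tgt, R, t):
    """3D-aware forward warping: back-project each pixel with its depth,
    apply the rigid transform, re-project into the target view.

    Occlusions/disocclusions leave holes (value 0) in this sketch.
    """
    h, w = img.shape[:2]
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N
    # Back-project every pixel to a 3D point in the source camera frame.
    pts_src = (np.linalg.inv(K_src) @ pix) * depth.reshape(1, -1)
    # Rigid transform into the target camera frame, then re-project.
    pts_tgt = R @ pts_src + t.reshape(3, 1)
    proj = K_tgt @ pts_tgt
    u_t = np.round(proj[0] / proj[2]).astype(int)
    v_t = np.round(proj[1] / proj[2]).astype(int)
    valid = (proj[2] > 0) & (u_t >= 0) & (u_t < w) & (v_t >= 0) & (v_t < h)
    out = np.zeros_like(img)
    zbuf = np.full((h, w), np.inf)
    src_vals = img.reshape(-1, *img.shape[2:])
    # Z-buffered splatting: when several source pixels land on the same
    # target pixel, keep the one closest to the target camera.
    for i in np.flatnonzero(valid):
        z = pts_tgt[2, i]
        if z < zbuf[v_t[i], u_t[i]]:
            zbuf[v_t[i], u_t[i]] = z
            out[v_t[i], u_t[i]] = src_vals[i]
    return out
```

For label maps, nearest-neighbour interpolation (or splatting, as in the depth-based warp) avoids mixing class indices; the holes left by the forward warp under occlusion and disocclusion are one reason the benefit of the depth-based approach varies with how severe the viewpoint change is.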
| File | Size | Format | Access |
|---|---|---|---|
| Canada_Juan.pdf | 11.02 MB | Adobe PDF | open access |
https://hdl.handle.net/20.500.12608/83209