Domain Adaptation Across Aerial Viewpoints in Semantic Segmentation of Drone Images
CANADA CARRIL, JUAN
2024/2025
Abstract
Domain adaptation is a critical challenge in computer vision, even more so in dense prediction tasks such as semantic segmentation. This work explores two distinct approaches to adapting semantic segmentation models across viewpoints in a synthetic drone video dataset. The first approach simulates perspective changes with homographies: leveraging the intrinsic and extrinsic camera parameters, 2D transformations warp the RGB images and the semantic segmentation ground truth between viewpoints. However, this method introduces significant distortions in 3D structures such as buildings and trees, limiting its effectiveness. To overcome these limitations, a second approach incorporates depth information captured by the camera. This multimodal technique uses depth maps to perform 3D-aware warping, allowing more accurate adaptation between viewpoints. It improves segmentation performance, though the degree of improvement depends on the severity of the viewpoint change. Preliminary results indicate that while homographies provide a computationally efficient solution, they fall short in adapting to 3D scene changes. Incorporating depth information offers a more robust approach, yielding better segmentation results. This study highlights the importance of leveraging 3D information for effective domain adaptation in aerial semantic segmentation and provides insights into the trade-offs between the two techniques.
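The two warping strategies described above can be sketched as follows. This is a minimal illustration rather than the thesis implementation: it assumes pinhole cameras with known intrinsics (`K_src`, `K_tgt`) and a relative pose (`R`, `t`) mapping source-camera coordinates to target-camera coordinates; the function names, the default plane parameters for the homography, and the z-buffered splatting are illustrative choices, not details taken from the work.

```python
# Sketch of planar-homography warping vs. depth-based 3D-aware warping
# between two calibrated viewpoints. All names are illustrative.
import cv2
import numpy as np

def homography_warp(img, K_src, K_tgt, R, t,
                    n=np.array([0.0, 0.0, 1.0]), d=1.0, is_label=False):
    """Warp via the plane-induced homography H = K_tgt (R - t n^T / d) K_src^-1.

    n and d (plane normal and distance in the source frame) are placeholders;
    the warp is only exact for points on that plane, which is why elevated 3D
    structures such as buildings and trees get distorted.
    """
    H = K_tgt @ (R - np.outer(t, n) / d) @ np.linalg.inv(K_src)
    h, w = img.shape[:2]
    interp = cv2.INTER_NEAREST if is_label else cv2.INTER_LINEAR
    return cv2.warpPerspective(img, H, (w, h), flags=interp)

def depth_warp(img, depth, K_src, K_tgt, R, t):
    """3D-aware forward warping: back-project each pixel with its depth,
    apply the rigid transform, re-project into the target view.

    Occlusions/disocclusions leave holes (value 0) in this sketch.
    """
    h, w = img.shape[:2]
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N
    # Back-project every pixel to a 3D point in the source camera frame.
    pts_src = (np.linalg.inv(K_src) @ pix) * depth.reshape(1, -1)
    # Rigid transform into the target camera frame, then re-project.
    pts_tgt = R @ pts_src + t.reshape(3, 1)
    proj = K_tgt @ pts_tgt
    u_t = np.round(proj[0] / proj[2]).astype(int)
    v_t = np.round(proj[1] / proj[2]).astype(int)
    valid = (proj[2] > 0) & (u_t >= 0) & (u_t < w) & (v_t >= 0) & (v_t < h)
    out = np.zeros_like(img)
    zbuf = np.full((h, w), np.inf)
    src_vals = img.reshape(-1, *img.shape[2:])
    # Z-buffered splatting: when several source pixels land on the same
    # target pixel, keep the one closest to the target camera.
    for i in np.flatnonzero(valid):
        z = pts_tgt[2, i]
        if z < zbuf[v_t[i], u_t[i]]:
            zbuf[v_t[i], u_t[i]] = z
            out[v_t[i], u_t[i]] = src_vals[i]
    return out
```

For label maps, nearest-neighbour interpolation (or splatting, as in the depth-based warp) avoids mixing class indices; the holes left by the forward warp under occlusion and disocclusion are one reason the benefit of the depth-based approach varies with how severe the viewpoint change is.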
| File | Size | Format | Access |
|---|---|---|---|
| Canada_Juan.pdf | 11.02 MB | Adobe PDF | open access |
https://hdl.handle.net/20.500.12608/83209