Semantic Segmentation from RGBD Data in the Autonomous Driving Context

ROSSETTO, LORENZO
2021/2022

Abstract

One of the most challenging problems in Computer Vision is Semantic Segmentation, a high-level task that assigns a class label to each pixel, producing a dense output map in which every pixel is classified according to its semantic content. The greatest limitation of semantic segmentation is its need for huge labelled datasets, which are expensive, time-consuming and sometimes impossible to collect. For this reason, many synthetic datasets have been produced by means of automatic engines to support the learning phase. Because models generalize poorly to information that comes from a different domain, new Domain Adaptation strategies have been studied with the aim of bridging the domain gap. In this thesis we propose a novel approach to Unsupervised Domain Adaptation for Semantic Segmentation in synthetic urban environments, in order to avoid the loss of performance in low-visibility conditions. Degrading factors, such as atmospheric occlusions or light variations, can strongly affect the statistical distribution of RGB samples, reducing segmentation capability. We introduce a novel version of the ResNet backbone, adapted to be compatible with multimodal data. We describe several strategies for combining RGB and depth samples at different convolutional levels, investigating how distance maps can contribute to the learning process. In addition, we explore several normalization methods applied to depth information, which expand small distance values and make them more dominant within the mixture of data in the latent space. Relevant improvements are introduced by the expanding disparity normalization method, which increases the capacity to discern objects closer to the sensor. Moreover, experimental results show that our approach improves segmentation performance over the Baseline model by 9.9 and 12.1 points on foggy and rainy settings, respectively.
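The abstract does not define the expanding disparity normalization, so the following is only a minimal sketch of the general idea behind disparity-style (inverse-depth) normalization, not the method used in the thesis. The function names and the 1 m to 100 m depth range are assumptions chosen for illustration. It contrasts a linear mapping of depth with a mapping of inverse depth, to show how the latter spreads small distance values over a much larger part of the output range.

```python
def linear_normalize(depth_m, d_min=1.0, d_max=100.0):
    """Map metric depth linearly to [0, 1] (hypothetical helper).

    Near and far distances receive the same resolution.
    """
    d = min(max(depth_m, d_min), d_max)  # clamp to the assumed sensor range
    return (d - d_min) / (d_max - d_min)


def disparity_normalize(depth_m, d_min=1.0, d_max=100.0):
    """Map inverse depth (disparity) to [0, 1] (hypothetical helper).

    1.0 corresponds to the nearest distance and 0.0 to the farthest, so
    differences between small distances are expanded.
    """
    d = min(max(depth_m, d_min), d_max)  # clamp to the assumed sensor range
    inv = 1.0 / d
    inv_near = 1.0 / d_min
    inv_far = 1.0 / d_max
    return (inv - inv_far) / (inv_near - inv_far)


if __name__ == "__main__":
    # Print both normalizations side by side for a few sample depths (metres).
    for depth in (1.0, 2.0, 5.0, 50.0, 100.0):
        print(depth, round(linear_normalize(depth), 4), round(disparity_normalize(depth), 4))
```

Under these assumptions, the gap between objects at 1 m and 2 m covers roughly half of the disparity-normalized range but only about one percent of the linearly normalized one, which is the sense in which small distance values become more dominant.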
2021
Segmentation
Domain Adaptation
RGBD data
Autonomous Driving
Neural Network
Files in this item:
File: Rossetto_Lorenzo.pdf (restricted access)
Size: 5.73 MB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/29236