Exploiting Retentive Networks in 3D LiDAR Semantic Segmentation

The advent of LiDAR technology has revolutionized the fields of autonomous driving, robotics, and environmental monitoring by providing precise 3D point cloud data. Semantic segmentation of LiDAR point clouds is an essential task for understanding the environment and facilitating intelligent decision-making in these applications. This Master's thesis introduces RangeRet, an approach to 3D LiDAR semantic segmentation that operates on range images to achieve real-time performance. It also exploits Retentive Networks, a recent Natural Language Processing (NLP) architecture designed for Large Language Models. Building on a comprehensive study of semantic segmentation of 3D LiDAR data and of state-of-the-art methods, the proposed approach introduces a lightweight network crafted to address key limitations of existing methods. It emphasizes efficiency, memory management, and real-time usage while achieving promising results. Applying Retentive Networks to computer vision tasks serves as an alternative to the established Transformer architecture, aiming to better capture geometric and spatial information in two-dimensional objects. The proposed method is evaluated on benchmark datasets, such as SemanticKITTI, in terms of accuracy, efficiency, and generalization across diverse scenarios. In the final section, the thesis conducts an ablation study, systematically dissecting the proposed method by isolating and evaluating individual components. This process identifies the contribution of each architectural element, provides insights into the network's robustness, and highlights key factors influencing performance.

MOSCO, SIMONE

2022/2023

Record

	Faculty/Department
	
				Dipartimento di Ingegneria dell'Informazione - DEI
			
	Degree programme
	
				COMPUTER ENGINEERING, Laurea Magistrale (Master's degree, D.M. 270/2004)
			
	Academic year
	
				2022
			
	English title
	
				Exploiting Retentive Networks in 3D LiDAR Semantic Segmentation
			
	Keywords
	
				3D Segmentation
				Retentive Networks
				Range imaging
			
	Supervisor
	
				PRETTO, ALBERTO
			
	Appears in document types:
	
				Master's theses

Files in this item:

File	Size	Format
Mosco_Simone.pdf (open access from 14/12/2024)	23.5 MB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/60407