Explainable Multi-Modal 3D Object Detection
SHARIFI, SHAYAN
2024/2025
Abstract
3D object detection involves identifying and localizing objects within a three-dimensional environment, using data from sensors such as cameras, LiDAR, or radar. This is a challenging task with critical applications in fields like robotics, autonomous driving, and augmented reality. Recently, transformer-based multi-modal models have gained popularity for 3D object detection due to their ability to capture complex relationships across diverse inputs. However, their end-to-end nature often leads to reduced explainability, posing challenges in safety-critical domains where it is essential for stakeholders to understand the model’s decision-making process. This thesis addresses the growing demand for transparency, interpretability, and trust in AI-driven perception systems by introducing tools that offer insight into how these advanced detectors make their predictions.

| File | Size | Format |
|---|---|---|
| Sharifi_Shayan.pdf (embargoed until 10/09/2026) | 17.73 MB | Adobe PDF |
The text of this website is © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/90310