The aim of this thesis is to study and analyze the problem of 6D pose estimation in the context of industrial bin picking, one of the most demanding applications in modern robotics. Estimating the position and orientation of objects from visual input data is a fundamental prerequisite for autonomous robotic manipulation, but it is among the hardest tasks to accomplish because of reflective surfaces, occlusions, object symmetries and heavily cluttered scenes, all of which are common in industrial settings. The 6D pose of a generic object in space can be fully described by a translation vector and a rotation matrix, which define the object's position and orientation with respect to the camera coordinate frame. Exploiting the depth information in RGB-D images improves the process further, yielding greater precision and robustness and enabling more reliable grasping strategies. After an extensive review of the state of the art in deep-learning-based 6D pose estimation, focused mainly on approaches developed for the BOP Challenge, the work first categorizes existing methods into approaches for "seen" and "unseen" objects, further distinguishing solutions that rely on CAD models from those that do not. Among these, the Geometry-guided Direct Regression Network Plus-Plus (GDRNPP), winner of the 2022 edition, was selected as the model best suited to industrial applications. Finally, this Master's thesis proposes a complete pipeline for industrial bin picking, consisting of four main stages: detection, localization/pose estimation, pose refinement, and grasp generation.
This research also studies how these technologies can be adapted and applied to real industrial environments, evaluating performance on an industrial dataset (ITODD) and thereby connecting academic research with practical applications.
6D Object Pose Estimation for Industrial Bin Picking Applications
GRANIELLO, CARMINE
2024/2025
Abstract
This thesis studies and analyzes the problem of 6D pose estimation in the context of industrial bin picking, one of the most challenging applications in robotics today. Estimating the position and orientation of objects from visual input data is a fundamental prerequisite for autonomous robotic manipulation, yet it is very difficult to achieve because of reflective surfaces, occlusions, symmetrical objects and cluttered environments, all commonly found in industrial scenarios. The 6D pose of a generic object in space can be fully described by a translation vector and a rotation matrix, which define the object's position and orientation with respect to the camera coordinate frame. Leveraging depth information from RGB-D images improves the precision and robustness of the estimate, enabling more reliable grasping strategies. After an extensive review of the state of the art in deep-learning-based 6D pose estimation, focused mainly on approaches developed for the BOP Challenge, the work first categorizes existing methods into "Seen" and "Unseen" object frameworks, further distinguishing between CAD-based and CAD-free solutions. Among these, the Geometry Guided Direct Regression Network Plus-Plus (GDRNPP), winner of the 2022 edition of the challenge, was identified as the most suitable model for industrial applications. Finally, this Master's thesis proposes a complete pipeline for an industrial bin-picking task, consisting of four main stages: detection, localization/pose estimation, depth refinement and grasp synthesis. The research also investigates how such technologies can be adapted, extended and applied to real-world industrial environments, evaluating performance on an industrial dataset (ITODD) to link academic research and practical applications.

| File | Size | Format |
|---|---|---|
| Graniello_Carmine.pdf (embargoed until 27/11/2026) | 9.47 MB | Adobe PDF |
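The abstract describes a 6D pose as a translation vector plus a rotation matrix expressed in the camera coordinate frame. As a minimal sketch of that representation (not code from the thesis), the two can be packed into a 4x4 homogeneous transform and used to map object-frame points into the camera frame. The function names `make_pose` and `model_to_camera`, the example values and the choice of metres as unit are assumptions made purely for illustration.

```python
import numpy as np


def make_pose(R, t):
    """Pack a 3x3 rotation matrix and a 3-vector translation into a 4x4 transform.

    Hypothetical helper for illustration; raises ValueError if R is not a
    proper rotation (orthonormal with determinant +1).
    """
    R = np.asarray(R, dtype=float)
    t = np.asarray(t, dtype=float).reshape(3)
    is_orthonormal = np.allclose(R @ R.T, np.eye(3), atol=1e-6)
    is_proper = np.isclose(np.linalg.det(R), 1.0, atol=1e-6)
    if not (is_orthonormal and is_proper):
        raise ValueError("R is not a valid rotation matrix")
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def model_to_camera(T, points):
    """Map N x 3 object-frame points into the camera frame: p_cam = R p_obj + t."""
    points = np.asarray(points, dtype=float)
    return points @ T[:3, :3].T + T[:3, 3]


# Assumed example: object rotated 90 degrees about the camera z-axis
# and placed 0.5 m in front of the camera.
R_z90 = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
T = make_pose(R_z90, [0.0, 0.0, 0.5])
p_cam = model_to_camera(T, [[0.1, 0.0, 0.0]])
print(p_cam)
```

A point 0.1 m along the object's x-axis ends up 0.1 m along the camera's y-axis and 0.5 m deep, which is the kind of camera-frame coordinate a grasp planner would consume.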
The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/98769