Methodologies for 6D object pose estimation (original title: Metodologie di stima della posa 6D di oggetti)
PASINATO, ALBERTO
2023/2024
Abstract
The ability to estimate the 6D pose of objects, that is, their position and orientation with respect to a reference system, is becoming increasingly important in many fields, such as robotics, for object manipulation, and autonomous driving, for tracking surrounding vehicles. In recent years, 6D pose estimation techniques have progressed significantly, largely thanks to the advent of Deep Learning and research on neural networks, which have surpassed traditional methods based on feature matching and template matching. However, challenges remain because of the many factors that complicate the task. This thesis provides a general overview of 6D pose estimation techniques, highlighting the strengths and limitations of the various methods. Particular attention is given to estimating the pose of objects that were not present during neural network training (unseen objects). This aspect is critical, since the immense variety of objects in the real world requires pose estimation systems to generalize effectively. An analysis of the BOP Challenge 2024, an international competition that identifies the best methods for 6D object pose estimation, highlights the most advanced methods, in particular FoundationPose and its innovative use of Large Language Models to improve the accuracy of 6D pose estimation.

File | Size | Format | Access
---|---|---|---
Pasinato_Alberto.pdf | 3.88 MB | Adobe PDF | Restricted access
The text of this website © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/67657