Design of a Model-Free Reinforcement Learning Algorithm Robust to Irreversible Events for Robotic Manipulation
ROSSI, LEONARDO
2024/2025
Abstract
With the growing demand for precise object manipulation in complex tasks, dexterous manipulation with multi-fingered robotic hands using reinforcement learning is emerging as an increasingly compelling topic in robotics research. However, most existing algorithms require extensive interaction with the environment to learn a task, which, in real-world scenarios, often leads to irreversible events such as object slippage. This thesis presents an approach to learning in-hand manipulation skills in the presence of irreversible events, based on the synergy between a model-free reinforcement learning module and a low-level reactive control module. The reactive layer consists of a slip-avoidance algorithm composed of two control actions: one based on the Coulomb friction law and the other exploiting the residual of a suitable Kalman filter to counteract dynamic conditions and uncertainties. The learning module is based on an adapted version of the Cross-Entropy (CE) policy search method that aims to fulfill the manipulation goal while minimizing the intervention of the reactive layer. Two Franka Emika Panda manipulators, equipped with soft semi-spherical grippers, are used to simulate a two-fingered arm setup. The proposed method is trained from scratch in a Robosuite simulation environment. The simulation results demonstrate that the proposed approach achieves the manipulation objectives while significantly reducing the incidence of object slippage during the learning process, and the experimental results validate the effectiveness of the reactive layer.

File | Size | Format | |
---|---|---|---|
Rossi_Leonardo.pdf (open access) | 6.38 MB | Adobe PDF | View/Open |
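The abstract names a Cross-Entropy policy search whose objective combines the manipulation goal with a penalty on reactive-layer intervention, but it does not give the adapted algorithm. As a rough illustration only, the following is a generic textbook cross-entropy search over a two-parameter toy cost, not the thesis's method. The function names, the quadratic task error, the `slip_weight` coefficient, and the intervention proxy are all hypothetical stand-ins chosen for this sketch.

```python
import random
import statistics


def cross_entropy_search(cost, dim, iterations=30, population=50,
                         elite_frac=0.2, seed=0):
    """Minimise `cost` over R^dim with a diagonal-Gaussian cross-entropy method."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    n_elite = max(2, int(population * elite_frac))
    for _ in range(iterations):
        # Sample candidate parameter vectors from the current Gaussian.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(population)]
        # Keep the lowest-cost fraction (the "elite" set).
        samples.sort(key=cost)
        elite = samples[:n_elite]
        # Refit the Gaussian to the elite set, one dimension at a time.
        mean = [statistics.fmean(col) for col in zip(*elite)]
        std = [statistics.pstdev(col) + 1e-6 for col in zip(*elite)]
    return mean


def toy_cost(params, goal=(1.0, -0.5), slip_weight=5.0):
    """Hypothetical cost: task error plus a penalty on reactive-layer use."""
    task_error = sum((p - g) ** 2 for p, g in zip(params, goal))
    # Assumed proxy: the reactive layer "intervenes" when |params[0]| > 1.5.
    intervention = max(0.0, abs(params[0]) - 1.5)
    return task_error + slip_weight * intervention


best = cross_entropy_search(toy_cost, dim=2)
print(best)
```

In this toy setting the search should settle near the hypothetical goal `(1.0, -0.5)`, where the intervention penalty is zero. In a real setup the cost would instead come from simulated rollouts, with the intervention term measured from the reactive controller's activity.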
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/84566