Offline Reinforcement Learning for Dexterous Robotic Hand Manipulation

CREMA, UMBERTO
2024/2025

Abstract

Dexterous robotic manipulation is one of the most challenging control problems in robotics, owing to the high-dimensional, complex dynamics involved. Offline reinforcement learning offers a promising way to train manipulation policies without costly or risky online interaction with the environment. This work applies offline reinforcement learning algorithms to robotic control, specifically to dexterous manipulation tasks from the Adroit Hand suite included in the D4RL benchmark datasets. Policies are trained exclusively on pre-collected datasets, without any additional online experience, and their performance is assessed through online evaluation episodes in simulation. Several offline RL methods are evaluated: Advantage-Weighted Actor-Critic (AWAC), Conservative Q-Learning (CQL), Implicit Q-Learning (IQL), and Twin Delayed Deep Deterministic Policy Gradient with Behavior Cloning (TD3+BC). A comparison with a supervised learning approach, Behavior Cloning (BC), is also provided to highlight differences in performance and learning capability. Results show that offline RL algorithms, particularly IQL and TD3+BC, can learn robust policies from limited expert data. In parallel, a 3D-printed robotic hand was developed as a prototype for future sim-to-real transfer.
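
To make the training setup described above concrete, the sketch below shows what the Behavior Cloning baseline could look like in PyTorch on a D4RL Adroit dataset. This is a minimal illustration, not code from the thesis: the dataset name "pen-expert-v1", the network sizes, and the training settings are illustrative assumptions, and the d4rl and gym packages are assumed to be installed.

# Minimal Behavior Cloning sketch on a D4RL Adroit task (illustrative only;
# "pen-expert-v1" and all hyperparameters are assumptions, not the thesis's setup).
import gym
import d4rl  # noqa: F401  (importing registers the Adroit environments)
import torch
import torch.nn as nn

env = gym.make("pen-expert-v1")
data = env.get_dataset()  # dict with "observations", "actions", ...

obs = torch.as_tensor(data["observations"], dtype=torch.float32)
act = torch.as_tensor(data["actions"], dtype=torch.float32)

# Simple MLP policy mapping observations to actions.
policy = nn.Sequential(
    nn.Linear(obs.shape[1], 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, act.shape[1]), nn.Tanh(),  # Adroit actions lie in [-1, 1]
)
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

for step in range(10_000):
    idx = torch.randint(0, len(obs), (256,))            # random minibatch
    loss = ((policy(obs[idx]) - act[idx]) ** 2).mean()  # MSE to demonstrated actions
    opt.zero_grad()
    loss.backward()
    opt.step()

The offline RL methods compared in the thesis (AWAC, CQL, IQL, TD3+BC) replace this pure supervised objective with value-based objectives that constrain or regularize the policy toward the dataset, which is what allows them to improve on the demonstrations rather than merely imitate them.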
Keywords: Offline RL; Robotic Manipulation; Dexterous Hands; RL Algorithms; Simulation
Files in this item:

File: Crema_Umberto.pdf (under embargo until 24/07/2028)
Size: 7.4 MB
Format: Adobe PDF


Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12608/89784