Speeding up Reinforcement Learning Algorithms Through Demonstrations: Application to Robotic Manipulation
PANIZZO, MARCO
2024/2025
Abstract
In modern industries, robots are widely used for repetitive tasks, with their performance largely dependent on the precision of their controllers to follow pre-defined trajectories. However, their ability to operate autonomously in complex, unstructured environments, particularly in tasks involving contact-rich manipulation, remains limited. Enhancing robots with the capability to autonomously acquire new skills for such challenging tasks is therefore highly desirable. The main challenge lies in developing adaptable and robust control algorithms that can generalize behaviors effectively, despite the complexity of the system. Reinforcement learning (RL) offers promising solutions by enabling agents to learn behaviors through interaction with their environments. Among RL methods, policy search algorithms can scale to high-dimensional systems and provide effective solutions. However, learning complex policies with numerous parameters often demands extensive sampling and is prone to poor local optima. This thesis aims to present a reinforcement learning policy search algorithm designed to accelerate the learning process for contact-rich manipulation tasks by incorporating task demonstrations. The focus is on integrating demonstrations into the guided policy search framework, enabling faster learning while avoiding suboptimal solutions.
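The abstract describes seeding the guided policy search framework with task demonstrations to speed up learning. As a minimal illustration of that idea (not the thesis's actual algorithm), the sketch below warm-starts a time-indexed controller from demonstrated action sequences by taking the per-timestep mean; the names `init_controller_from_demos` and the toy data are assumptions for illustration only.

```python
# Hypothetical sketch: warm-starting policy search from demonstrations.
# All names and data here are illustrative, not taken from the thesis.

def init_controller_from_demos(demo_actions):
    """Initialize a time-indexed open-loop controller as the per-timestep
    mean of demonstrated actions (a simple behavior-cloning warm start
    that a policy-search loop could then refine)."""
    n_demos = len(demo_actions)
    horizon = len(demo_actions[0])
    return [sum(demo[t] for demo in demo_actions) / n_demos
            for t in range(horizon)]

# Two toy demonstrations, three timesteps, one-dimensional action.
demos = [[0.0, 0.5, 1.0],
         [0.2, 0.5, 0.8]]
u0 = init_controller_from_demos(demos)  # per-timestep mean action
```

A real guided-policy-search implementation would instead fit time-varying linear-Gaussian controllers and iterate between trajectory optimization and supervised policy training; this sketch only shows how demonstrations can provide the initial guess that reduces the sampling burden mentioned above.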
File | Size | Format | Access
---|---|---|---
Panizzo_Marco.pdf | 6.99 MB | Adobe PDF | open access
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/81950