Reinforcement learning in shared intelligence systems for mobile robots
MIHAILOVIC, MIROLJUB
2021/2022
Abstract
In this thesis, we investigate how to integrate reinforcement learning into a shared intelligence system in which the user's commands are fused on equal terms with the robot's perception during the teleoperation of mobile robots. Specifically, we present a new policy implementation suitable for navigating unknown indoor environments while avoiding collisions with static and dynamic obstacles. The aim of this thesis is to extend the current shared intelligence system, based on several pre-defined policies, with a new policy based on Reinforcement Learning (RL). To design this policy, an agent learns to reach a predefined goal through repeated trial-and-error interactions. To make the robot learn correct actions, a reward function inspired by the Attractive Potential Field (APF) is defined, and the state is computed by a pre-processing module that clusters the obstacles around the robot and finds the point of each cluster closest to the robot. Different clustering algorithms are analysed to establish which is the most suitable for this purpose, given the real-time constraints of the system. Different model configurations are examined, trained in Gazebo-based simulation scenarios, and then evaluated in distinct navigation scenarios; in this way, we verified the reactive navigation behaviour of the agent with static and dynamic obstacles. Finally, the shared system combined with the new RL policy is tested and compared with the current state-of-the-art version in a dedicated teleoperation experiment in which an operator interacted with the robot by delivering high-level commands.
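The two technical ingredients named in the abstract, an APF-inspired reward and a clustering-based state pre-processing step, can be sketched as follows. This is a minimal illustration assuming a quadratic attractive potential and DBSCAN clustering; all names and parameters (`apf_attractive_reward`, `closest_obstacle_points`, `K_ATT`, `eps`, `min_samples`) are illustrative assumptions, not the thesis's actual implementation.

```python
# Hypothetical sketch of the two components described in the abstract:
# (1) a reward shaped on the Attractive Potential Field, and
# (2) obstacle clustering that returns the closest point per cluster.
# Names, gains, and parameters are assumptions for illustration only.
import numpy as np
from sklearn.cluster import DBSCAN

K_ATT = 1.0  # attractive gain (assumed)

def apf_attractive_reward(robot_xy, goal_xy, prev_xy):
    """Reward based on the attractive potential
    U_att(q) = 0.5 * k * ||q - q_goal||^2:
    the agent is rewarded for decreasing the potential at each step."""
    u_now = 0.5 * K_ATT * np.sum((robot_xy - goal_xy) ** 2)
    u_prev = 0.5 * K_ATT * np.sum((prev_xy - goal_xy) ** 2)
    return u_prev - u_now  # positive when the robot moves toward the goal

def closest_obstacle_points(scan_xy, eps=0.3, min_samples=3):
    """Cluster scan points (robot-centred frame, shape (N, 2)) and
    return, for each obstacle cluster, the point closest to the robot."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(scan_xy)
    closest = []
    for label in set(labels) - {-1}:  # -1 marks DBSCAN noise points
        cluster = scan_xy[labels == label]
        dists = np.linalg.norm(cluster, axis=1)  # distance to robot origin
        closest.append(cluster[np.argmin(dists)])
    return np.array(closest)
```

Under these assumptions, the closest points returned by the clustering step would form part of the RL state, while the potential difference would provide a dense reward signal guiding the agent toward the goal.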
File | Size | Format
---|---|---
Mihailovic_Miroljub.pdf (restricted access) | 4.56 MB | Adobe PDF
https://hdl.handle.net/20.500.12608/30829