Robot Learning Techniques Based on Large Language Models and NLP: A Survey

Satapathy, Anasuya
2024/2025

Abstract

This thesis presents an in-depth investigation into the integration of Large Language Models (LLMs), Natural Language Processing (NLP), and Transformer architectures within robotic systems. Motivated by the rapid evolution of artificial intelligence and the increasing complexity of real-world robotic applications, the research explores how these advanced models can address longstanding challenges such as language ambiguity, seamless integration with robotic control, and operational safety in dynamic environments. The study begins by laying a theoretical foundation, introducing the core principles behind LLMs, NLP, and Transformers and tracing their historical development. Special emphasis is placed on key components such as self-attention and multi-head attention mechanisms, which are critical in enabling these models to process complex sequences of data and generate contextually coherent outputs. Building on this theoretical framework, the thesis conducts a systematic review of the state-of-the-art technologies underpinning these models. It examines how recent innovations in Transformer-based architectures have been adapted to various robotic tasks, including grasp detection, human–robot interaction, and multi-modal data processing. The work also addresses challenges related to resource efficiency and model interpretability, proposing methods to mitigate computational overhead while improving the transparency of decision-making within AI systems. The research is further supported by experimental case studies and a broad review of recent publications. These case studies provide practical evidence that integrating LLMs and Transformer models can improve the performance of robotic systems, with demonstrated applications in automated grasping and natural language-driven human–robot collaboration. By consolidating current methods and outcomes, this thesis not only charts the current landscape but also offers directions for future research aimed at advancing robotic intelligence and fostering more intuitive, effective interaction between humans and robots.
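As background to the attention mechanisms the abstract highlights, the following is a minimal NumPy sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, together with a simplified multi-head split. It is an illustrative sketch only, not drawn from the thesis: it omits the learned per-head projection matrices and the output projection of a full Transformer layer, and all array sizes and variable names are assumptions chosen for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)
    return softmax(scores, axis=-1) @ V

# Toy example: a sequence of 4 tokens with model dimension 8,
# split into 2 heads of dimension 4 (illustrative sizes only).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
heads = np.split(x, 2, axis=-1)  # two heads, each of shape (4, 4)
out = np.concatenate(
    [scaled_dot_product_attention(h, h, h) for h in heads], axis=-1
)
print(out.shape)  # (4, 8): per-token outputs, context-mixed by attention
```

The 1/sqrt(d_k) scaling keeps the dot products from growing with head dimension, which would otherwise push the softmax into near-saturated, vanishing-gradient regimes; splitting the model dimension across heads lets each head attend to the sequence with an independent weighting.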
Keywords: Large Language Model, Transformers, NLP, Robotics, LangChain
File: Satapathy_Anasuya.pdf (Adobe PDF, 3.5 MB, open access)


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/82606