Robot Learning Techniques Based on Large Language Models and NLP: A Survey

Satapathy, Anasuya
2024/2025

Abstract

This thesis presents an in-depth investigation into the integration of Large Language Models (LLMs), Natural Language Processing (NLP), and Transformer architectures within robotic systems. Motivated by the rapid evolution of artificial intelligence and the increasing complexity of real-world robotic applications, the research explores how these advanced models can address longstanding challenges such as language ambiguity, seamless integration with robotic control, and operational safety in dynamic environments. The study begins by laying a theoretical foundation, introducing the core principles behind LLMs, NLP, and Transformers and tracing their historical development. Special emphasis is placed on key components such as self-attention and multi-head attention mechanisms, which are critical in enabling these models to process complex sequences of data and generate contextually coherent outputs. Building on this theoretical framework, the thesis conducts a systematic review of the state-of-the-art technologies underpinning these models. It examines how recent innovations in Transformer-based architectures have been adapted to various robotic tasks, including grasp detection, human–robot interaction, and multi-modal data processing. The work also addresses challenges related to resource efficiency and model interpretability, proposing methods to mitigate computational overhead while improving the transparency of decision-making within AI systems. The research is further supported by experimental case studies and a broad review of recent publications. These case studies provide practical evidence that integrating LLMs and Transformer models can improve the performance of robotic systems, with demonstrated applications in automated grasping and natural language-driven human–robot collaboration. By consolidating current methods and outcomes, this thesis not only charts the current landscape but also offers directions for future research aimed at advancing robotic intelligence and fostering more intuitive, effective interaction between humans and robots.
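As background to the attention mechanisms the abstract highlights, the following is a minimal NumPy sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, together with a simplified multi-head split. It is an illustrative sketch only, not drawn from the thesis: it omits the learned per-head projection matrices and the output projection of a full Transformer layer, and all array sizes and variable names are assumptions chosen for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)
    return softmax(scores, axis=-1) @ V

# Toy example: a sequence of 4 tokens with model dimension 8,
# split into 2 heads of dimension 4 (illustrative sizes only).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
heads = np.split(x, 2, axis=-1)  # two heads, each of shape (4, 4)
out = np.concatenate(
    [scaled_dot_product_attention(h, h, h) for h in heads], axis=-1
)
print(out.shape)  # (4, 8): per-token outputs, context-mixed by attention
```

The 1/sqrt(d_k) scaling keeps the dot products from growing with head dimension, which would otherwise push the softmax into near-saturated, vanishing-gradient regimes; splitting the model dimension across heads lets each head attend to the sequence with an independent weighting.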
Keywords: Large Language Model, Transformers, NLP, Robotics, LangChain
File: Satapathy_Anasuya.pdf (Adobe PDF, 3.5 MB, open access)


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/82606