Prompt Optimization with Textual Gradients (ProTeGi): Applications in Retrieval-Augmented Generation

PIVA, GIOVANNI
2024/2025

Abstract

This thesis explores the application of Prompt Optimization with Textual Gradients (ProTeGi), a novel approach to automatic prompt optimization for Large Language Models (LLMs), with a particular focus on Retrieval-Augmented Generation (RAG) systems. ProTeGi, which simulates gradient descent in natural language space, offers a systematic method for iteratively improving prompts through feedback and structured refinement. The research begins by reimplementing the original ProTeGi algorithm and testing it on multiple state-of-the-art models, including GPT-4o and GPT-4o-mini, to evaluate its effectiveness on modern LLMs; this comparative analysis across model architectures yields insights into how different LLMs respond to optimized prompts. Following this validation, an in-depth analysis of the linguistic and structural characteristics of successful prompts identifies several patterns not covered in previous research, including the relationship between prompt length and effectiveness, the impact of specific organizational structures, and the influence of certain linguistic features on model performance. These findings are then applied to a real-world case study: the development of "Rescue", a RAG-based customer support system for a telecommunications company. By incorporating ProTeGi-derived insights into prompt engineering, the system achieved significant improvements in both response rate (the share of queries receiving a response rose from 54% to 80%) and response quality (the share of acceptable responses rose from 80% to 90%). The case study demonstrates how structured prompt optimization can substantially improve RAG system performance even with limited training data. The thesis contributes to the field by bridging theoretical prompt optimization research with practical RAG implementation challenges, providing actionable guidance for developing more effective AI-assisted information retrieval systems in specialized domains.
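The abstract describes ProTeGi as simulated gradient descent in natural-language space: score the current prompt, collect failures, ask an LLM to articulate what is wrong (the "textual gradient"), and rewrite the prompt in that direction. The sketch below illustrates that loop only; it is not the thesis's implementation. The names `llm` and `evaluate` are hypothetical placeholders (any text-completion callable and any scorer returning accuracy plus failing examples), and the bandit-based candidate selection of the original ProTeGi algorithm is replaced here by direct scoring for brevity.

from typing import Callable, List, Tuple

def protegi_step(prompt: str,
                 llm: Callable[[str], str],
                 evaluate: Callable[[str], Tuple[float, List[str]]],
                 n_gradients: int = 2,
                 n_edits: int = 2) -> List[str]:
    """Expand one prompt into candidate successors via textual gradients."""
    _, errors = evaluate(prompt)              # collect failing examples
    error_text = "\n".join(errors[:4])        # keep the feedback short
    candidates = []
    for _ in range(n_gradients):
        # "Textual gradient": a natural-language critique of why the prompt fails.
        gradient = llm(
            f"The prompt below fails on the listed examples.\n"
            f"Prompt:\n{prompt}\n\nFailures:\n{error_text}\n\n"
            f"Explain, in one short paragraph, what is wrong with the prompt."
        )
        for _ in range(n_edits):
            # Apply the gradient: rewrite the prompt to address the critique.
            candidates.append(llm(
                f"Rewrite the prompt to fix the problem described.\n"
                f"Prompt:\n{prompt}\n\nProblem:\n{gradient}\n\n"
                f"Return only the improved prompt."
            ))
    return candidates

def optimize(seed_prompt: str,
             llm: Callable[[str], str],
             evaluate: Callable[[str], Tuple[float, List[str]]],
             beam_width: int = 4,
             steps: int = 3) -> str:
    """Beam search over prompt space, keeping the highest-scoring prompts."""
    beam = [seed_prompt]
    for _ in range(steps):
        expanded = beam + [c for p in beam for c in protegi_step(p, llm, evaluate)]
        expanded.sort(key=lambda p: evaluate(p)[0], reverse=True)
        beam = expanded[:beam_width]
    return beam[0]

In this reading, the critique plays the role of a gradient, the rewrite plays the role of a parameter update, and beam search keeps the most promising prompts across iterations.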
Keywords: LLM, Prompt, GPT, Gradient Descent, ProTeGi
Files in this item:
Final_thesis_Piva_PDFa.pdf (Adobe PDF, 1.02 MB, restricted access)


Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12608/84790