Prompt Optimization with Textual Gradients (ProTeGi): Applications in Retrieval-Augmented Generation

PIVA, GIOVANNI
2024/2025

Abstract

This thesis explores the application of Prompt Optimization with Textual Gradients (ProTeGi), a novel approach to automatic prompt optimization for Large Language Models (LLMs), with a particular focus on Retrieval-Augmented Generation (RAG) systems. ProTeGi, which simulates gradient descent in natural language space, offers a systematic method for iteratively improving prompts through feedback and structured refinement. The research begins by reimplementing the original ProTeGi algorithm and testing it on multiple state-of-the-art models, including GPT-4o and GPT-4o-mini, to evaluate its effectiveness on modern LLMs; this comparative analysis across model architectures yields insights into how different LLMs respond to optimized prompts. Following this validation, an in-depth analysis of the linguistic and structural characteristics of successful prompts identifies several patterns not covered in previous research, including the relationship between prompt length and effectiveness, the impact of specific organizational structures, and the influence of certain linguistic features on model performance. These findings are then applied to a real-world case study: the development of "Rescue", a RAG-based customer support system for a telecommunications company. By incorporating ProTeGi-derived insights into prompt engineering, the system achieved significant improvements in both response rate (the share of queries receiving a response rose from 54% to 80%) and response quality (the share of acceptable responses rose from 80% to 90%). The case study demonstrates how structured prompt optimization can substantially improve RAG system performance even with limited training data. The thesis contributes to the field by bridging theoretical prompt optimization research with practical RAG implementation challenges, providing actionable guidance for developing more effective AI-assisted information retrieval systems in specialized domains.
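The abstract describes ProTeGi as simulated gradient descent in natural-language space: score the current prompt, collect failures, ask an LLM to articulate what is wrong (the "textual gradient"), and rewrite the prompt in that direction. The sketch below illustrates that loop only; it is not the thesis's implementation. The names `llm` and `evaluate` are hypothetical placeholders (any text-completion callable and any scorer returning accuracy plus failing examples), and the bandit-based candidate selection of the original ProTeGi algorithm is replaced here by direct scoring for brevity.

from typing import Callable, List, Tuple

def protegi_step(prompt: str,
                 llm: Callable[[str], str],
                 evaluate: Callable[[str], Tuple[float, List[str]]],
                 n_gradients: int = 2,
                 n_edits: int = 2) -> List[str]:
    """Expand one prompt into candidate successors via textual gradients."""
    _, errors = evaluate(prompt)              # collect failing examples
    error_text = "\n".join(errors[:4])        # keep the feedback short
    candidates = []
    for _ in range(n_gradients):
        # "Textual gradient": a natural-language critique of why the prompt fails.
        gradient = llm(
            f"The prompt below fails on the listed examples.\n"
            f"Prompt:\n{prompt}\n\nFailures:\n{error_text}\n\n"
            f"Explain, in one short paragraph, what is wrong with the prompt."
        )
        for _ in range(n_edits):
            # Apply the gradient: rewrite the prompt to address the critique.
            candidates.append(llm(
                f"Rewrite the prompt to fix the problem described.\n"
                f"Prompt:\n{prompt}\n\nProblem:\n{gradient}\n\n"
                f"Return only the improved prompt."
            ))
    return candidates

def optimize(seed_prompt: str,
             llm: Callable[[str], str],
             evaluate: Callable[[str], Tuple[float, List[str]]],
             beam_width: int = 4,
             steps: int = 3) -> str:
    """Beam search over prompt space, keeping the highest-scoring prompts."""
    beam = [seed_prompt]
    for _ in range(steps):
        expanded = beam + [c for p in beam for c in protegi_step(p, llm, evaluate)]
        expanded.sort(key=lambda p: evaluate(p)[0], reverse=True)
        beam = expanded[:beam_width]
    return beam[0]

In this reading, the critique plays the role of a gradient, the rewrite plays the role of a parameter update, and beam search keeps the most promising prompts across iterations.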
Keywords: LLM, Prompt, GPT, Gradient Descent, ProTeGi
Files in this item:
Final_thesis_Piva_PDFa.pdf (Adobe PDF, 1.02 MB, restricted access)


Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12608/84790