Experimental Study on Retrieval-Augmented Generation: Engineering and Evaluation of a Custom RAG system for Open-Domain QA

ANTOLINI, GIANLUCA

2024/2025

Abstract

This thesis presents the design, implementation, and evaluation of a custom Retrieval-Augmented Generation (RAG) system for open-domain Question Answering (QA). The work builds on the TREC RAG 2024 Track and combines traditional and innovative information retrieval methods with local large language models (LLMs) to build a flexible and efficient end-to-end pipeline. The retrieval component is based on the Pyserini framework and uses the datasets, evaluation data, and tools provided by TREC (e.g., the MS MARCO Segment v2.1 collection). Several retrieval strategies were explored, including BM25, query expansion, pseudo-relevance feedback, and re-ranking. For the generation component, multiple local LLMs were tested under different prompting strategies and configurations, with particular attention to performance optimization through quantization, GPU acceleration, and fine-tuning. The results were then compared with outputs from state-of-the-art hosted LLMs to assess relative quality and performance. In addition, the thesis introduces a preliminary experiment with Parametric RAG (PRAG), an approach presented in a paper published in January 2025, in which context is integrated as model parameters instead of, or in addition to, prompt inputs. The results highlight how different combinations of retrieval and generation techniques affect the relevance and quality of the final answers. This experimental study contributes to the practical understanding of how to build customized, efficient, and interpretable RAG systems using open-source tools and local models.
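As a rough illustration of the retrieve-then-generate pipeline described in the abstract, the sketch below ranks a toy corpus with Okapi BM25 and assembles the top passages into a prompt. It is not the thesis implementation, which relies on Pyserini and the MS MARCO Segment v2.1 collection. The corpus, the function names (`bm25_rank`, `build_prompt`), the prompt wording, and the parameter values are all assumptions made for the example. The generation step, passing the prompt to a local LLM, is left out.

```python
import math
from collections import Counter

# Toy corpus: hypothetical passages, not taken from MS MARCO.
CORPUS = {
    "d1": "BM25 is a ranking function used by search engines",
    "d2": "Large language models generate text from a prompt",
    "d3": "Quantization reduces the memory footprint of a model",
}


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for the example)."""
    return text.lower().split()


def bm25_rank(query, corpus, k1=0.9, b=0.4, top_k=2):
    """Return the top_k (doc_id, score) pairs under Okapi BM25.

    k1 and b are illustrative values, not tuned settings from the thesis.
    """
    docs = {doc_id: tokenize(text) for doc_id, text in corpus.items()}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in docs.values() for term in set(tokens))

    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores[doc_id] = score

    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def build_prompt(question, corpus, hits):
    """Place the retrieved passages before the question (hypothetical template)."""
    context = "\n".join(f"[{doc_id}] {corpus[doc_id]}" for doc_id, _ in hits)
    return (
        "Answer using only the passages below.\n"
        f"{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


question = "what is quantization of a model"
hits = bm25_rank(question, CORPUS)
prompt = build_prompt(question, CORPUS, hits)
```

In a full system, a sparse first-stage ranker of this kind would typically feed a re-ranker before the prompt is built, and the prompt would then be sent to the generation model.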

Record

	Faculty/Department
	
				Dipartimento di Ingegneria dell'Informazione - DEI
			
	Degree programme
	
				COMPUTER ENGINEERING Master's degree (D.M. 270/2004)
			
	Academic year
	
				2024
			
	English title
	
				Experimental Study on Retrieval-Augmented Generation: Engineering and Evaluation of a Custom RAG system for Open-Domain QA
			
	Keywords
	
				RAG
				IR
				LLM
			
	Supervisor
	
				FERRO, NICOLA
			
	Appears in collections:
	
				Master's theses

Files in this item:

File	Size	Format
main.pdf (open access)	4.63 MB	Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/86949