Retrieval-Augmented Generation: Strengthening Answer Confidence through Source Referencing
CECCATO, ANDREA
2023/2024
Abstract
This thesis presents the design of a Retrieval-Augmented Generation (RAG) system and proposes an evaluation strategy to assess its performance in open-domain Question Answering (QA). The study explores how different settings, including prompt design, deduplication strategies, generator model temperature, and context document order, affect the quality of generated responses. Performance is measured through metrics that assess answer correctness, fluency, and citation quality. The experiments, conducted as part of the TREC 2024 RAG track, reveal significant trade-offs between these factors. Longer prompts improved fluency and correctness but reduced citation quality, while deduplication strategies often discarded useful context, lowering overall answer quality. Changes in generator temperature and context document order had minimal impact on the results. The proposed evaluation strategy provides a structured approach for assessing the system's performance and enables effective comparison across the different settings. This work also discusses limitations, such as the evaluation method's inability to penalize unnecessary citations and the computational inefficiency of deduplication. Future research should focus on alternative evaluation metrics, efficient retrieval systems, and improved citation strategies.

File | Size | Format
---|---|---
Ceccato_Andrea.pdf (restricted access) | 2.28 MB | Adobe PDF
The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/75155