Hierarchical and Entity-Based Retrieval Augmented Generation
SANDRINELLI, FEDERICO
2023/2024
Abstract
Retrieval-Augmented Generation (RAG) has become a standard approach for addressing key challenges in large language models (LLMs), such as hallucinations and a lack of domain-specific knowledge. However, RAG systems face significant limitations on tasks that require multi-hop reasoning or holistic comprehension of textual content. These tasks often require retrieving context from multiple sources or developing a thorough understanding of a whole document, which is difficult for the traditional semantic search methods used in standard RAG implementations. This thesis investigates two state-of-the-art methods, RAPTOR and GraphRAG, which address these challenges from distinct perspectives. Furthermore, it introduces two novel models that leverage entity and relationship extraction to enhance the retrieval step in RAG systems. One of these, the relationships-based model, achieved a 26% improvement in accuracy over GraphRAG on the 2WikiMultiHopQA dataset. This result highlights the potential of graph-like structures to improve the effectiveness of RAG systems, paving the way for more robust solutions in multi-hop reasoning and complex retrieval tasks.

File | Size | Format
---|---|---
Sandrinelli_Federico.pdf (restricted access) | 1.21 MB | Adobe PDF
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/80902