A review on Graph RAG methods
AKBARI, NILA
2024/2025
Abstract
Large Language Models (LLMs), built upon the transformer architecture and capable of few-shot learning, are nevertheless constrained by their static, pre-trained knowledge and a propensity for hallucination. Retrieval-Augmented Generation (RAG) addresses these issues by grounding generation in external data. However, standard "Naive RAG" frameworks, often relying on dense passage retrieval, sequence-to-sequence denoising, and vector stores like Chroma, fail when handling complex, relational data, as they treat knowledge as an unstructured "bag of chunks." To resolve this, a new paradigm reviewed in this thesis, Graph RAG, has emerged, transforming source corpora into structured knowledge graphs. While advanced frameworks such as "From Local to Global" (GraphRAG), StructRAG, and LightRAG claim superiority over retrieval-less approaches like HOLMES or graph optimization methods like REANO, they have been developed in isolation, leaving practitioners without clear, comparative data. This thesis bridges that gap by presenting a rigorous benchmark analysis of these leading Graph RAG frameworks, using the Gemini 2.0 and 2.5 Flash-Lite models as the underlying reasoning engines. The frameworks are evaluated against Naive RAG and a Simple LLM baseline across two distinct datasets: UltraDomainCS and NarrativeQA. Performance is measured with a broad range of metrics, including answer quality (via LLM-as-a-Judge and BERTScore), cost, latency, and robustness. Our findings quantify the limitations of unstructured retrieval and the specific trade-offs of graph-based approaches. On technical data, Naive RAG proved detrimental, scoring 43.82% and falling below the Simple LLM baseline (49.20%). In contrast, LightRAG emerged as the optimal solution (92.75%) while consuming approximately 8x fewer tokens than frameworks like StructRAG. Conversely, on narrative data, StructRAG was the only robust performer, uniquely capable of bridging long-distance relationships, though at high cost. Finally, while GraphRAG provided high-quality answers, its latency rendered it non-viable for real-time applications. These results constitute the first data-driven trade-off analysis for Graph RAG, demonstrating that the choice of framework must be optimized for the specific density and cost constraints of the domain.
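To make the "bag of chunks" baseline concrete, the Python sketch below illustrates what a Naive RAG loop over a Chroma vector store can look like: chunks are embedded and indexed, the top-k matches are retrieved for a question, and a grounded prompt is assembled for the reasoning model. The collection name, toy corpus, and prompt template are illustrative assumptions and are not taken from the thesis itself.

```python
# Minimal sketch of a Naive RAG baseline: dense retrieval over a Chroma
# vector store, followed by grounded generation. Names and data are placeholders.
import chromadb

# In-memory Chroma client; Chroma applies its default sentence-embedding
# function to documents and queries unless another one is supplied.
client = chromadb.Client()
collection = client.create_collection(name="naive_rag_demo")

# Toy corpus standing in for the chunked source documents ("bag of chunks").
chunks = [
    "LightRAG builds a lightweight entity-relation graph over the corpus.",
    "StructRAG restructures documents into a task-specific hybrid format.",
    "GraphRAG summarises communities of a knowledge graph for global queries.",
]
collection.add(documents=chunks, ids=[f"chunk-{i}" for i in range(len(chunks))])

def naive_rag_prompt(question: str, k: int = 2) -> str:
    """Retrieve the top-k chunks and assemble a grounded prompt."""
    results = collection.query(query_texts=[question], n_results=k)
    context = "\n".join(results["documents"][0])
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

prompt = naive_rag_prompt("How does LightRAG index a corpus?")
# The prompt would then be sent to the underlying reasoning engine
# (e.g., a Gemini flash-lite model) to produce the final answer.
print(prompt)
```

Because retrieval in this loop scores each chunk independently, relationships that span distant chunks are easily missed, which is precisely the failure mode the graph-based frameworks compared in the thesis are designed to address.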
| File | Size | Format | Access |
|---|---|---|---|
| Thesis_NilaAkbari.pdf | 1.93 MB | Adobe PDF | open access |
https://hdl.handle.net/20.500.12608/102077