Leveraging Large Language Models for Structured Summarization with Evaluation through Reference Matching
URUN, HILAL
2023/2024
Abstract
The rapid growth of digital information necessitates effective methods for condensing lengthy documents into coherent and concise summaries. This thesis explores the application of structured summarization techniques using large language models (LLMs) to enhance the efficiency and quality of summarizing long texts. The study focuses on leveraging advanced LLMs to generate structured summaries and evaluates their performance against human-written summaries using both qualitative and quantitative metrics. The research begins by selecting a corpus of long documents from various domains and generating structured summaries using state-of-the-art LLMs. These summaries are then compared to human-crafted summaries to assess their coherence, completeness, and accuracy. The evaluation framework incorporates statistical metrics such as ROUGE, BLEU, and METEOR to quantitatively measure the performance of the generated summaries. Additionally, a qualitative analysis is conducted through human evaluation, where experts assess the readability, relevance, and informativeness of the summaries. The results from both quantitative and qualitative assessments provide a comprehensive understanding of the strengths and limitations of LLM-based structured summarization. The findings of this thesis contribute to the advancement of summarization technologies and offer insights into improving the performance of LLMs in generating high-quality summaries. The study underscores the potential of structured summarization in managing the ever-increasing volume of information and highlights the importance of continuous refinement in language models to meet the evolving needs of information processing.
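The reference-matching evaluation the abstract describes can be illustrated with a minimal sketch. In practice such studies use established packages (e.g. the `rouge-score` library), but at its core ROUGE-1 is a clipped unigram overlap between a generated summary and a human reference; the example strings below are hypothetical and chosen only for illustration.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: unigram overlap between a generated and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

generated = "the model produces a concise structured summary"
human = "the human wrote a concise summary"
print(round(rouge1_f1(generated, human), 3))  # → 0.615
```

BLEU and METEOR follow the same compare-against-reference pattern but weight n-gram precision and synonym/stem matches differently, which is why the thesis reports several metrics side by side.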
File: Urun_Hilal.pdf (open access), 2.14 MB, Adobe PDF
https://hdl.handle.net/20.500.12608/74893