AUDITING OPEN-SOURCE LLMs FOR ACADEMIC AUTHOR SEARCH
BAROLO, DANIELE
2023/2024
Abstract
This Master’s thesis explores the effectiveness of state-of-the-art Large Language Models (LLMs) in searching for scholars across various fields of physics. The study audits the models’ capabilities in different search-related tasks, with a special emphasis on minority representation. These tasks include top-k endorsements, epoch-specific suggestions, field-based recommendations, seniority-related recommendations, and statistical twins. The thesis discusses the models’ strengths, limitations, and potential biases, providing insights for future improvements in academic author search engines. Through rigorous experimentation and data analysis, this work contributes to the broader understanding of LLMs’ application in academic contexts and addresses critical issues of fairness and representation in AI-driven scholarly tools.

File | Size | Format |
---|---|---|
Final_Draft_Daniele_Barolo_Data_Science_MsC_Thesis_UniPD.pdf (open access) | 1.11 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/80879