Gender Bias in AI: Measuring and Debiasing Occupational Gender Representations in Large Language Models

ANTONY SAHAYAM, ANNE LINDA
2024/2025

Abstract

Gender bias in artificial intelligence (AI), particularly in large language models (LLMs) such as LLaMA 2, presents a significant challenge, as these models are trained on vast datasets that often encode and reflect societal biases and gender inequalities. This thesis explores how LLaMA 2 (Llama-2-7b-hf) and LLaMA 3 (Llama-3.2-1B) exhibit both explicit and implicit gender bias in occupational predictions, using a prompt-based probability evaluation framework. The models were probed with occupation-based templates, and their responses were evaluated according to the probabilities they assign to gendered pronouns. Explicit bias was measured through direct responses to gender-neutral prompts, while implicit bias was assessed through the distribution of gendered pronouns in more conversational contexts. The study incorporates both English and Italian datasets, with the latter providing additional insight because of the grammatically gendered nature of Italian. To address the observed gender imbalances, zero-shot debiasing was applied via instructional prompts, significantly reducing explicit bias and moderately improving diverse gender representation. This work demonstrates how lightweight, language-aware prompt engineering can serve as an effective and reproducible strategy for bias assessment and mitigation in multilingual LLMs, contributing to the development of fairer AI systems.
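
The methodology summarized above can be illustrated with a minimal sketch, assuming a Hugging Face causal language model and PyTorch. The template wording, the occupation examples, the pronoun set, and the debiasing instruction below are illustrative assumptions, not the exact configuration used in the thesis.

```python
# Minimal sketch of prompt-based pronoun-probability evaluation with optional
# zero-shot instructional debiasing. Illustrative only: the template, the
# debiasing instruction, and the pronoun set are assumptions, not the
# thesis's exact setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-3.2-1B"  # or "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
model.eval()

# Hypothetical zero-shot debiasing instruction prepended to the template.
DEBIAS_INSTRUCTION = "Do not assume a person's gender from their occupation. "

def pronoun_probs(occupation: str, debias: bool = False) -> dict[str, float]:
    """Next-token probabilities for gendered pronouns after an occupation template."""
    prompt = f"The {occupation} said that"  # example gender-neutral template
    if debias:
        prompt = DEBIAS_INSTRUCTION + prompt
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token
    probs = torch.softmax(logits, dim=-1)
    # Use the first sub-token of each pronoun (with a leading space) as a proxy.
    return {
        p.strip(): probs[tokenizer.encode(p, add_special_tokens=False)[0]].item()
        for p in (" he", " she", " they")
    }

for job in ("nurse", "engineer"):
    print(job, pronoun_probs(job), pronoun_probs(job, debias=True))
```

Comparing the gap between the "he" and "she" probabilities with and without the debiasing instruction yields the kind of simple, reproducible bias measure the abstract describes; an Italian variant would additionally need to account for gender-marked job titles and articles.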
Keywords

Gender Bias
Large Language Model
AI Stereotypes
Bias Mitigation
Ethical AI
Files in this item:
AntonySahayam_AnneLinda.pdf (Adobe PDF, 6.67 MB, open access)

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license; metadata are released under a CC0 license.

Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12608/87664