Pregiudizi di genere nei Word Embeddings: verso un’analisi del gender score nei documenti testuali.

This thesis surveys the main experiments conducted on Word Embeddings (WEs), with a focus on those involving Italian-language WEs, and then analyzes the gender score of text documents. After an introduction to the history and state of the art of research on the topic, an experimental part aims to determine the gender score of a textual document. An initial analysis uses WEs together with text-processing techniques such as the removal of stopwords and punctuation characters; the study then moves on to more advanced models such as n-grams and sentence embeddings. The work attempts to show that these natural language analysis techniques make it possible to determine the gender score of a textual document, giving authors a tool for producing more gender-neutral documents with fewer gender stereotypes and biases.
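
This record does not describe how the thesis actually computes its gender score, so the following is only a minimal sketch of one common approach: remove punctuation and stopwords, then average the cosine similarity between each remaining word vector and a "he minus she" direction. The toy three-dimensional vectors, the tiny stopword list, and the names `TOY_EMBEDDINGS`, `STOPWORDS`, `tokenize`, and `gender_score` are all assumptions made for illustration and do not come from the thesis.

```python
import math
import string

# Toy 3-dimensional vectors invented for illustration; they are NOT real embeddings.
TOY_EMBEDDINGS = {
    "he": [1.0, 0.1, 0.0],
    "she": [-1.0, 0.1, 0.0],
    "engineer": [0.6, 0.8, 0.1],
    "nurse": [-0.7, 0.6, 0.2],
    "report": [0.0, 0.9, 0.4],
}

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "wrote"}


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def tokenize(text):
    """Lowercase the text, strip ASCII punctuation, and drop stopwords."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [token for token in cleaned.split() if token not in STOPWORDS]


def gender_score(text, embeddings=TOY_EMBEDDINGS):
    """Mean cosine similarity of known words with the he-she direction.

    Positive values lean toward "he", negative values toward "she",
    and 0.0 is returned when no word of the text has a vector.
    """
    direction = [a - b for a, b in zip(embeddings["he"], embeddings["she"])]
    scores = [
        cosine(embeddings[token], direction)
        for token in tokenize(text)
        if token in embeddings
    ]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    for sentence in ("The engineer wrote a report.", "The nurse wrote a report."):
        print(sentence, round(gender_score(sentence), 3))
```

With real pretrained vectors in place of the toy dictionary, the same function would give a per-document score; whether a simple mean is the right aggregation is a design choice that this sketch does not settle.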

FRISO, LUCA

2021/2022

Faculty/Department	Dipartimento di Ingegneria dell'Informazione - DEI
Degree programme	Ingegneria Informatica (Computer Engineering), Laurea Magistrale (D.M. 270/2004)
Academic year	2021
English title	Gender bias in Word Embeddings: toward a gender score analysis in textual documents.
Keywords	Word Embeddings; gender bias; textual document; gender score; Italian WE
Supervisor	RODA', ANTONIO
Appears in types	Lauree magistrali (master's theses)

Files in this item:

File	Size	Format
Friso_Luca.pdf (open access)	3.04 MB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/40248