Inferring Gender and Sexual Orientation from Handwriting Samples Using ChatGPT
POP, INGRID RALUCA
2024/2025
Abstract
This experimental thesis investigates whether a large language model (ChatGPT) can infer gender from handwriting samples, and whether its misclassifications correlate with the sexual orientation of the writers. A total of 234 handwriting samples were analyzed, with the model asked to guess each writer's gender. Overall classification accuracy was 61%, and was significantly higher for female samples (74%) than for male samples (51%). Notably, the model was more accurate when classifying heterosexual individuals (64% for heterosexual men, 81% for heterosexual women) than homosexual individuals (41% for homosexual men, 55% for homosexual women). Homosexual men were misclassified as women 58% of the time, while homosexual women were misclassified as men 44% of the time. These results suggest that when individuals deviate from expected gendered handwriting patterns, possibly because of stylistic traits associated with sexual orientation, the model is more likely to be inaccurate. Handwriting feature analysis revealed statistically significant gender effects for rounded letters (p < .001), consistent letter size (p < .001), even spacing (p < .001), and angular forms (p < .001). When stratified by sexual orientation, significant effects emerged, particularly for consistent angular forms (p = .002) and rounded letters (p = .019), with homosexual participants more likely to exhibit these traits. The findings raise important ethical considerations: if AI systems can indirectly infer sensitive attributes such as sexual orientation through behavioral cues like handwriting, there are serious implications for privacy, profiling, and potential discrimination.
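The abstract describes two analyses: per-group accuracy of the model's gender guesses, and significance tests of binary handwriting features against gender. The snippet below is a minimal sketch of how such an analysis could be run, not the thesis's actual code; the file name and column names (`gender`, `orientation`, `predicted_gender`, `rounded_letters`) are hypothetical placeholders.

```python
# Sketch: per-group classification accuracy and a chi-squared test of
# a binary handwriting feature (rounded letters) against gender.
# Data layout is assumed, not taken from the thesis.
import pandas as pd
from scipy.stats import chi2_contingency

df = pd.read_csv("handwriting_ratings.csv")  # hypothetical file

# Accuracy of the model's gender guesses, stratified by gender and orientation
df["correct"] = df["predicted_gender"] == df["gender"]
print(df.groupby(["gender", "orientation"])["correct"].mean())

# Chi-squared test: is a rounded-letter style associated with gender?
table = pd.crosstab(df["gender"], df["rounded_letters"])
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.4f}")
```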
https://hdl.handle.net/20.500.12608/91090