Examining Anonymization Techniques for Effective AI Debiasing
KHAMIDOV, FATTOKH HAMID UGLI
2025/2026
Abstract
Artificial Intelligence (AI) has become essential in sectors such as healthcare, finance, and transportation, enhancing data processing and decision-making capabilities. However, AI systems often exhibit biases, leading to unfair or discriminatory outcomes, particularly concerning race and gender. Addressing these biases is crucial for ensuring fairness and protecting diverse demographic groups. Our study aims to determine the effectiveness of k-anonymity and ε-differential privacy for AI debiasing. Using the Adult Income dataset from the UCI Machine Learning Repository, we implemented k-anonymity by generalizing and suppressing key attributes, and ε-differential privacy by adding controlled noise. We evaluated the impact of these techniques on AI model performance using Random Forest and XGBoost classifiers, focusing on accuracy, precision, recall, and F1-score. Fairness was assessed through metrics such as demographic parity, equal opportunity, and disparate impact, while privacy was evaluated using re-identification risk and the privacy loss parameter ε, which controls the amount of added noise. Our findings indicate that ε-differential privacy significantly enhances fairness by reducing biases related to race and gender without substantially compromising model performance. In contrast, k-anonymity, while effective in protecting privacy, had a more varied impact on fairness and performance. A combined approach of k-anonymity and ε-differential privacy resulted in a balanced trade-off, maintaining high levels of privacy, fairness, and utility. Our research highlights the potential of differential privacy as a robust tool for ethical AI development, emphasizing the importance of selecting and implementing appropriate anonymization techniques to achieve optimal outcomes in AI systems.

| File | Size | Format | |
|---|---|---|---|
| Fattokh_Thesis_December_2025.pdf (open access) | 719.75 kB | Adobe PDF | View/Open |
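The three fairness metrics named in the abstract (demographic parity, equal opportunity, and disparate impact) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the group labels `"M"` and `"F"`, and the toy predictions are all hypothetical and chosen only to show one common way these quantities are defined.

```python
from typing import Dict, Sequence


def selection_rate(y_pred: Sequence[int], group: Sequence[str], value: str) -> float:
    """Share of positive predictions among the members of one group."""
    preds = [p for p, g in zip(y_pred, group) if g == value]
    return sum(preds) / len(preds)  # assumes the group is non-empty


def true_positive_rate(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[str], value: str
) -> float:
    """Share of actual positives in one group that are predicted positive."""
    preds = [p for t, p, g in zip(y_true, y_pred, group) if g == value and t == 1]
    return sum(preds) / len(preds)  # assumes the group has at least one positive


def fairness_report(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    group: Sequence[str],
    privileged: str,
    unprivileged: str,
) -> Dict[str, float]:
    """Compare the unprivileged group against the privileged one."""
    sr_priv = selection_rate(y_pred, group, privileged)
    sr_unpriv = selection_rate(y_pred, group, unprivileged)
    tpr_priv = true_positive_rate(y_true, y_pred, group, privileged)
    tpr_unpriv = true_positive_rate(y_true, y_pred, group, unprivileged)
    return {
        # 0 means both groups receive positive predictions at the same rate
        "demographic_parity_difference": sr_unpriv - sr_priv,
        # 0 means both groups have the same true positive rate
        "equal_opportunity_difference": tpr_unpriv - tpr_priv,
        # 1 means equal selection rates; lower values disadvantage the unprivileged group
        "disparate_impact": sr_unpriv / sr_priv,
    }


# Hypothetical toy data, not drawn from the Adult Income dataset
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = ["M", "M", "M", "M", "F", "F", "F", "F"]

report = fairness_report(y_true, y_pred, group, privileged="M", unprivileged="F")
print(report)
```

On this toy data the selection rates are 0.75 for `"M"` and 0.25 for `"F"`, so the two difference metrics are negative and the disparate impact ratio falls well below 1, all indicating a disadvantage for the unprivileged group.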
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.
https://hdl.handle.net/20.500.12608/108180