Differential Privacy Techniques in Federated Learning: Application to Diabetic Retinopathy Image Processing

SHAHBAZI, MAHSA
2023/2024

Abstract

The advent of federated learning has opened new frontiers in the privacy-preserving analysis of medical data, enabling collaborative model training without direct data sharing. This is particularly critical in healthcare, where patient confidentiality and data protection are paramount. Traditional anonymization techniques are often insufficient against sophisticated attacks that can re-identify individuals from anonymized datasets; federated learning mitigates this risk by training models on decentralized data that never leaves its source. This thesis explores the integration of differential privacy (DP) techniques into federated learning frameworks, applied to diabetic retinopathy (DR) image processing, an area of medical diagnostics where early detection and classification of disease stages can significantly affect patient outcomes. It presents a comprehensive study comparing four models: a centralized non-private machine learning model and a non-private federated learning model as baselines, alongside two differentially private federated learning models using the Gaussian and Laplace mechanisms. The aim is to characterize the trade-off between model utility and privacy preservation, which is crucial for deploying machine learning in sensitive domains. For the differentially private models, we identify the noise levels for the Gaussian and Laplace mechanisms that offer the best balance between accuracy and privacy, and we compare the two mechanisms by plotting their accuracy and by assessing privacy through the quality of images reconstructed in a simulated inversion attack. This simulation assumes a worst-case scenario in which the attacker has high-level access, and it tests the robustness of the DP-enhanced federated learning models against attempts to reconstruct individual data points from aggregated updates, showing how the added noise degrades the reconstructed images. Experimental results demonstrate that the DP-enhanced federated learning models achieve competitive accuracy in classifying diabetic retinopathy images while providing privacy guarantees and resilience against inversion attacks: the added noise renders the reconstructed images far less informative, yet accuracy remains close to that of the baseline models. This research contributes empirical evidence of the feasibility of deploying differential privacy in federated learning for medical image analysis, suggesting that privacy-preserving federated learning can be both practical and effective, balancing data security with the need for high-quality medical diagnostics.
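
As a rough illustration of how the Gaussian and Laplace mechanisms described above can be applied to client model updates in a federated setting, the following minimal Python sketch adds calibrated noise before server-side aggregation. This is not the thesis code: the clipping norm, noise scales, and function names are illustrative assumptions, and the Laplace variant uses the clipped L2 norm as a stand-in for the sensitivity.

# Minimal sketch (illustrative assumptions, not the thesis implementation):
# clip each client's model update to bound its sensitivity, add Gaussian or
# Laplace noise, then average the noised updates on the server.
import numpy as np

def clip_update(update, clip_norm=1.0):
    # Bound the L2 norm of a client's update so the sensitivity is known.
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_norm / (norm + 1e-12))

def gaussian_mechanism(update, clip_norm=1.0, noise_multiplier=1.0):
    # Gaussian mechanism: add N(0, (noise_multiplier * clip_norm)^2) noise per coordinate.
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return clipped + np.random.normal(0.0, sigma, size=clipped.shape)

def laplace_mechanism(update, clip_norm=1.0, epsilon=1.0):
    # Laplace mechanism: add Laplace noise with scale sensitivity / epsilon
    # (here the clipping norm is used as the sensitivity, an assumption).
    clipped = clip_update(update, clip_norm)
    scale = clip_norm / epsilon
    return clipped + np.random.laplace(0.0, scale, size=clipped.shape)

# Example: server-side federated averaging of noised client updates.
client_updates = [np.random.randn(10) for _ in range(5)]  # stand-ins for real updates
noised = [gaussian_mechanism(u, clip_norm=1.0, noise_multiplier=0.5) for u in client_updates]
aggregated = np.mean(noised, axis=0)

In practice, the noise multiplier (or epsilon) would be the tuning knob explored in the thesis: larger noise makes reconstructed images less informative in an inversion attack but costs classification accuracy.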
Keywords

Federated Learning
Differential Privacy
Image Processing
Inversion Attack
Diabetic Retinopathy

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/77014