Dataset inference per reti neurali generative

Generative Adversarial Networks (GANs) have had great success in generating artificial samples from datasets of sensitive data that cannot be disclosed publicly. If released to the public, these GANs could allow an attacker to leak sensitive information from the training dataset. We analyze a type of attack called the Membership Inference Attack (MIA), which consists of determining whether a given sample belongs to the training set of the GAN. We analyze the success of both black-box and white-box MIAs on GANs trained on the MNIST and anime faces datasets. We look for a relationship between the precision of the attacks and several hyperparameters of the GANs: the number of images in the training set, the number of training epochs, the quality of the generated images, and the number of generated images available to the attacker. We show that an insufficient number of training images or an excessive number of training epochs causes the GAN to overfit, which makes it vulnerable to MIAs. We analyze how the Fréchet Inception Distance (FID) between the set of generated images and the original training set affects the success of the MIAs.
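To make the black-box setting in the abstract concrete, here is a minimal sketch of one common distance-based membership inference heuristic. It is not taken from the thesis. It assumes the attacker sees only generated samples and knows roughly how many candidates are training members. The function names (`min_distance`, `distance_attack`) and the toy 2-D data are hypothetical and chosen for illustration only.

```python
import math

def min_distance(candidate, generated):
    """Euclidean distance from a candidate to its nearest generated sample."""
    return min(math.dist(candidate, g) for g in generated)

def distance_attack(candidates, generated, n_members):
    """Flag the n_members candidates closest to the generator's output.

    Assumption (not from the thesis): an overfitted generator emits samples
    close to its training data, so a small distance suggests membership.
    Returns a list of booleans aligned with `candidates`.
    """
    scores = [min_distance(c, generated) for c in candidates]
    ranked = sorted(range(len(candidates)), key=lambda i: scores[i])
    flagged = set(ranked[:n_members])
    return [i in flagged for i in range(len(candidates))]

# Toy 2-D stand-ins for images: the generator nearly reproduces its training points.
train = [(0.0, 0.0), (1.0, 1.0)]
held_out = [(5.0, 5.0), (-4.0, 3.0)]
generated = [(0.05, -0.02), (0.98, 1.01), (0.5, 0.4)]

candidates = train + held_out
predictions = distance_attack(candidates, generated, n_members=2)
print(predictions)
```

In this toy case the two training points lie closest to the generated samples and are the ones flagged. With a generator that does not overfit, the distances of members and non-members would overlap and the heuristic would perform closer to chance.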

DE GOBBI, MATTEO

2022/2023

Record

	Faculty/Department
	
				Dipartimento di Ingegneria dell'Informazione - DEI
			
	Degree programme
	
				INGEGNERIA DELL'INFORMAZIONE Laurea di Primo Livello (D.M. 270/2004)
			
	Academic year
	
				2022
			
	English title
	
				Dataset inference on Generative Neural Networks
			
	Keywords
	
				GAN
				Dataset inference
				Membership inference
			
	Supervisor
	
				MILANI, SIMONE
			
	Appears in types:
	
				Bachelor's theses

Files in this item:

File	Size	Format
DeGobbi_Matteo.pdf (open access)	2.16 MB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/57083