Backdoor Attacks and Defences on Neural Networks
Massimiliano Belluomini
Academic Year 2021/2022
Abstract
In recent years, we have seen an explosion of activity in deep learning in both academia and industry. Deep Neural Networks (DNNs) significantly outperform previous machine learning techniques in various domains, e.g., image recognition, speech processing, and translation. However, the security of DNNs is now recognized as a realistic concern, and backdoor attacks are one prominent threat. The basic concept of a backdoor attack is to hide a secret functionality in a system, in our case a DNN: the system behaves as expected for most inputs, but a malicious input activates the backdoor. Deep learning models are often trained and provided by third parties or their training is outsourced to the cloud, because the computational power required to train reliable models is not always available to individual engineers or small companies. Apart from outsourcing the training phase, another common strategy is transfer learning, in which an existing model is fine-tuned for a new task. Both scenarios allow adversaries to manipulate model training and implant backdoors.

This thesis investigates several aspects of backdoor attacks on DNNs. We present a new type of trigger for audio signals based on echo. Short echoes (shorter than 1 ms) are inaudible to humans, yet they can still act as a trigger for command recognition systems. We show that this trigger bypasses STRIP-ViTA, a popular defence mechanism against backdoors. We also analyze neuron activations in backdoored models and design a possible defence based on an empirical observation: the neurons of the last layer of a DNN show high variance in their activations when the input samples contain the trigger. Finally, we analyze and evaluate blind backdoor attacks, which combine code and data poisoning, test them against a defence not previously evaluated against this type of attack, and propose a way to bypass that defence.
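The echo trigger can be illustrated with a minimal sketch: the poisoned waveform is the original signal plus a delayed, attenuated copy of itself. The function name add_echo_trigger and the parameter values (16 kHz sampling, 0.5 ms delay, 0.4 decay) are illustrative assumptions, not the exact configuration used in the thesis.

import numpy as np

def add_echo_trigger(signal, sample_rate=16000, delay_ms=0.5, decay=0.4):
    """Superimpose a short echo (here < 1 ms) onto a mono audio signal."""
    delay_samples = int(sample_rate * delay_ms / 1000.0)  # 8 samples at 16 kHz
    echo = np.zeros_like(signal)
    if delay_samples > 0:
        echo[delay_samples:] = decay * signal[:-delay_samples]
    # Keep the poisoned waveform inside the normalised [-1, 1] range.
    return np.clip(signal + echo, -1.0, 1.0)

# Example: poison a one-second clip before adding it, with the attacker's
# chosen target label, to the training set.
clip = np.random.uniform(-1.0, 1.0, 16000).astype(np.float32)
poisoned_clip = add_echo_trigger(clip)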
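The activation-variance observation can likewise be sketched: collect last-layer activations on a known-clean batch and on a suspect batch, then compare per-neuron variances. The helper names and the ratio threshold below are hypothetical; the thesis bases its possible defence on this empirical observation, not necessarily on this exact test.

import numpy as np

def neuron_variances(activations):
    """activations: (n_samples, n_neurons) matrix of last-layer outputs."""
    return activations.var(axis=0)

def looks_triggered(clean_acts, test_acts, ratio_threshold=3.0):
    """Flag a batch whose per-neuron variance greatly exceeds the clean reference."""
    clean_var = neuron_variances(clean_acts) + 1e-8  # avoid division by zero
    test_var = neuron_variances(test_acts)
    return bool(np.any(test_var / clean_var > ratio_threshold))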
https://hdl.handle.net/20.500.12608/42054