Weight Perturbation Techniques for CNN Ensembles: A Post-Training Approach to Improve Generalization

OSTI, SIMONE
2024/2025

Abstract

This thesis explores perturbed ensembles as a strategy to enhance the accuracy and robustness of deep neural networks by encouraging the exploration of multiple regions of the parameter space starting from a single pre-trained convolutional neural network (CNN). By applying targeted post-training weight perturbations, the base model is shifted towards diverse yet complementary local optima, enabling the ensemble to capture a broader variety of decision boundaries. The outputs of the perturbed models are then fused using the Sum Rule at the logit level, leveraging the combined strengths of the individual predictors. Three perturbation families are investigated: Gaussian and curvature-aware noise, Conditional Sign Flip, and Layer-wise Fisher Information perturbations. Their design is theoretically motivated by generalization bounds such as PAC-Bayes and Margin-Norm, and by recent work on improving prediction consistency and correct-consistency in deep ensembles. Extensive MATLAB-based experiments on multiple image classification datasets demonstrate that the proposed Perturbed Ensemble consistently surpasses the baseline model in terms of accuracy, prediction consistency, and correct-consistency. Statistical analysis confirms the significance of these improvements, while ablation studies highlight that combining heterogeneous perturbations yields superior results compared to any single strategy alone. These findings validate perturbed ensembles as an efficient, theoretically grounded, and post-training compatible approach for improving the generalization and reliability of deep learning models, making them a promising solution for scenarios where re-training from scratch is impractical and robustness under varying conditions is critical.
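The overall pipeline described above — perturb a pre-trained model's weights several times, then fuse the members' logits with the Sum Rule — can be sketched in a few lines. The thesis's experiments are MATLAB-based; the following is an illustrative Python/NumPy sketch, not the author's implementation. The function names (`gaussian_perturb`, `conditional_sign_flip`, `sum_rule`), the toy linear model, and the values of `sigma`, `threshold`, and `p` are all assumptions chosen for illustration; the thesis's actual Conditional Sign Flip and Fisher-based schemes may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_perturb(weights, sigma=0.01, rng=rng):
    """Add isotropic Gaussian noise to every weight tensor (one ensemble member)."""
    return [w + rng.normal(0.0, sigma, size=w.shape) for w in weights]

def conditional_sign_flip(weights, threshold=1e-3, p=0.05, rng=rng):
    """Illustrative sign-flip scheme: flip small-magnitude weights with probability p."""
    out = []
    for w in weights:
        mask = (np.abs(w) < threshold) & (rng.random(w.shape) < p)
        out.append(np.where(mask, -w, w))
    return out

def sum_rule(logits_list):
    """Fuse ensemble members by summing their logits; argmax gives the prediction."""
    return np.sum(logits_list, axis=0)

# Toy linear "model": logits = x @ W (4 input features, 3 classes).
W = rng.normal(size=(4, 3))
x = rng.normal(size=(1, 4))

# Build five perturbed members from the single base model W, then fuse.
members = [gaussian_perturb([W])[0] for _ in range(5)]
fused = sum_rule([x @ Wi for Wi in members])
pred = int(np.argmax(fused))
```

Summing raw logits (rather than averaging softmax outputs) keeps the fusion linear and lets confident members dominate; since argmax is invariant to dividing by the ensemble size, the Sum Rule and the mean rule yield the same prediction here.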
Keywords: Weight Perturbation, CNN, Ensembles
Files in this record:
Osti_Simone_Computer_Engineering_MsC_Thesis_UniPD.pdf — 7.3 MB, Adobe PDF (under embargo until 27/11/2026)
The text of this website © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/98780