On the Computation of Relative Flatness in Convolutional Layers
Taleghani Zolpirani, Rahman
2024/2025
Abstract
The goal of this thesis is to extend and apply the flatness measure proposed by Petzka et al. (2021) to convolutional neural networks (CNNs). This measure, derived from the trace of the Hessian of the loss function with respect to the model parameters, correlates strongly with the generalization capacity and robustness of deep learning models. While prior work performs the computation for fully-connected layers, this study examines the challenges and implications of computing the relative flatness measure in convolutional layers, specifically when such a layer is the final layer of the network, preceding a softmax activation and cross-entropy loss. Due to the structural properties of convolutional layers, such as weight sharing and spatial locality, computing the Hessian matrix becomes significantly more involved. This thesis studies these computational challenges and provides an effective way to compute the reparameterization-invariant flatness measure in this setting.
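For concreteness, the quantity at the heart of the abstract can be illustrated on a toy fully-connected classification head, the case already covered by Petzka et al. (2021): their relative flatness weights the traces of the row-pair Hessian blocks Tr(H_{s,s'}) by the inner products ⟨w_s, w_{s'}⟩ of the corresponding weight rows. The sketch below is illustrative only, not the thesis's implementation; all shapes, names, and the toy data are assumptions, and the exhaustive Hessian is feasible only because the head is tiny. The convolutional case the thesis addresses is precisely where this brute-force route stops scaling.

```python
import torch
import torch.nn.functional as F

# Toy setup (all shapes hypothetical): activations feeding a small
# fully-connected classification head, followed by softmax cross-entropy.
torch.manual_seed(0)
features = torch.randn(32, 16)          # activations of the penultimate layer
targets = torch.randint(0, 10, (32,))   # class labels
weight = torch.randn(10, 16) * 0.1      # final-layer weight matrix w

def loss_fn(w):
    logits = features @ w.T             # final layer, then cross-entropy
    return F.cross_entropy(logits, targets)

# Exact Hessian of the loss w.r.t. the final-layer weights,
# shape (10, 16, 10, 16): entry [s, i, s', j] couples w[s, i] and w[s', j].
H = torch.autograd.functional.hessian(loss_fn, weight)

# Plain Hessian trace, the raw ingredient mentioned in the abstract.
hessian_trace = torch.einsum('sisi->', H)

# Relative flatness in the spirit of Petzka et al. (2021):
# kappa = sum over s, s' of <w_s, w_s'> * Tr(H_{s,s'}).
block_traces = torch.einsum('aibi->ab', H)  # Tr(H_{s,s'}) for every row pair
gram = weight @ weight.T                    # <w_s, w_s'> for every row pair
kappa = (gram * block_traces).sum()

print(f"tr(H) = {hessian_trace.item():.4f}, kappa = {kappa.item():.4f}")
```

The weighting by the weight-row Gram matrix is what makes the measure invariant to layer-wise rescaling reparameterizations, which the raw Hessian trace alone is not.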
| File | Access | Size | Format |
|---|---|---|---|
| Taleghani_Zolpirani_Rahman.pdf | Restricted access | 1.41 MB | Adobe PDF |
https://hdl.handle.net/20.500.12608/91987