On the Computation of Relative Flatness in Convolutional Layers

Taleghani Zolpirani, Rahman
2024/2025

Abstract

The goal of this thesis is to extend and apply the flatness measure proposed by Petzka et al. (2021) to convolutional neural networks (CNNs). This measure, derived from the trace of the Hessian of the loss function with respect to the model parameters, correlates strongly with the generalization capacity and robustness of deep learning models. While prior work computes it only for fully connected layers, this study examines the challenges and implications of computing the relative flatness measure in convolutional layers, specifically when such a layer is the final layer of the network, preceding a softmax activation and cross-entropy loss. Because of the structural properties of convolutional layers, such as weight sharing and spatial locality, computing the Hessian matrix becomes significantly more involved. This thesis studies these computational challenges and provides an effective way to compute the reparameterization-invariant flatness measure in this setting.
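As a point of reference (reconstructed from Petzka et al. (2021), not quoted from the thesis), the relative flatness of a layer with weight rows w_1, ..., w_m combines pairwise inner products of the rows with blocks of the loss Hessian H:

    \kappa_{Tr}(w) = \sum_{s,s'=1}^{m} \langle w_s, w_{s'} \rangle \, \mathrm{Tr}(H_{s,s'})

The sketch below illustrates one way the central ingredient, the trace of the Hessian of the loss with respect to the final convolutional layer's weights, could be estimated without ever materializing the Hessian, via Hutchinson's estimator and Hessian-vector products in PyTorch. It is a minimal sketch, not the thesis's implementation: the toy architecture, the 1x1 final convolution with global average pooling, all shapes and data, and the coarse ||w||^2 * Tr(H) surrogate at the end are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy setup mirroring the setting studied: a feature extractor, a final
# convolutional layer, then softmax + cross-entropy. Shapes are illustrative.
features = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
final_conv = nn.Conv2d(8, 10, 1)          # assumed 1x1 final convolution
x = torch.randn(16, 3, 8, 8)              # illustrative batch
y = torch.randint(0, 10, (16,))

# Treat the loss as a function of the final conv weights only.
w = final_conv.weight.detach().clone().requires_grad_(True)

def loss_of_w(w):
    logits = F.conv2d(features(x), w, final_conv.bias).mean(dim=(2, 3))
    return F.cross_entropy(logits, y)      # softmax + cross-entropy

# Hutchinson estimator for Tr(H): E_v[v^T H v] with Rademacher v, computed
# through Hessian-vector products so H is never formed explicitly.
loss = loss_of_w(w)
(grad,) = torch.autograd.grad(loss, w, create_graph=True)
trace_est, n_samples = 0.0, 20
for _ in range(n_samples):
    v = torch.randint_like(w, high=2) * 2.0 - 1.0   # entries in {-1, +1}
    (hvp,) = torch.autograd.grad((grad * v).sum(), w, retain_graph=True)
    trace_est += (hvp * v).sum().item()
trace_est /= n_samples

# Coarse scalar surrogate only: the full measure of Petzka et al. (2021)
# also weighs cross-neuron Hessian blocks, as in the formula above.
print("Tr(H) estimate:", trace_est)
print("||w||^2 * Tr(H):", w.norm().pow(2).item() * trace_est)

Working through Hessian-vector products sidesteps the difficulty of writing out the convolutional Hessian under weight sharing, since autograd differentiates through the convolution directly; the cost is one extra backward pass per probe vector.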
Keywords

Deep learning (DL)
Convolutional layers
Flatness measure
Generalization in DL

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/91987