On the Computation of Relative Flatness in Convolutional Layers
Taleghani Zolpirani, Rahman
2024/2025
Abstract
The goal of this thesis is to extend and apply the flatness measure proposed by Petzka et al. (2021) to convolutional neural networks (CNNs). This measure, derived from the trace of the Hessian of the loss function with respect to the model parameters, correlates strongly with the generalization capacity and robustness of deep learning models. While prior work performs the computation for fully-connected layers, this study examines the challenges and implications of computing the relative flatness measure in convolutional layers, specifically when such a layer is the final layer of the network, preceding a softmax activation and cross-entropy loss. Due to the structural properties of convolutional layers, such as weight sharing and spatial locality, computing the Hessian matrix becomes significantly more involved. This thesis studies these computational challenges and provides an effective way to compute the reparameterization-invariant flatness measure in this setting.
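For concreteness, the quantity at the heart of the abstract can be illustrated on a toy fully-connected classification head, the case already covered by Petzka et al. (2021): their relative flatness weights the traces of the row-pair Hessian blocks Tr(H_{s,s'}) by the inner products ⟨w_s, w_{s'}⟩ of the corresponding weight rows. The sketch below is illustrative only, not the thesis's implementation; all shapes, names, and the toy data are assumptions, and the exhaustive Hessian is feasible only because the head is tiny. The convolutional case the thesis addresses is precisely where this brute-force route stops scaling.

```python
import torch
import torch.nn.functional as F

# Toy setup (all shapes hypothetical): activations feeding a small
# fully-connected classification head, followed by softmax cross-entropy.
torch.manual_seed(0)
features = torch.randn(32, 16)          # activations of the penultimate layer
targets = torch.randint(0, 10, (32,))   # class labels
weight = torch.randn(10, 16) * 0.1      # final-layer weight matrix w

def loss_fn(w):
    logits = features @ w.T             # final layer, then cross-entropy
    return F.cross_entropy(logits, targets)

# Exact Hessian of the loss w.r.t. the final-layer weights,
# shape (10, 16, 10, 16): entry [s, i, s', j] couples w[s, i] and w[s', j].
H = torch.autograd.functional.hessian(loss_fn, weight)

# Plain Hessian trace, the raw ingredient mentioned in the abstract.
hessian_trace = torch.einsum('sisi->', H)

# Relative flatness in the spirit of Petzka et al. (2021):
# kappa = sum over s, s' of <w_s, w_s'> * Tr(H_{s,s'}).
block_traces = torch.einsum('aibi->ab', H)  # Tr(H_{s,s'}) for every row pair
gram = weight @ weight.T                    # <w_s, w_s'> for every row pair
kappa = (gram * block_traces).sum()

print(f"tr(H) = {hessian_trace.item():.4f}, kappa = {kappa.item():.4f}")
```

The weighting by the weight-row Gram matrix is what makes the measure invariant to layer-wise rescaling reparameterizations, which the raw Hessian trace alone is not.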
| File | Access | Size | Format |
|---|---|---|---|
| Taleghani_Zolpirani_Rahman.pdf | Restricted access | 1.41 MB | Adobe PDF |
https://hdl.handle.net/20.500.12608/91987