Dynamical low-rank training of neural networks
ZANGRANDO, EMANUELE
2021/2022
Abstract
Neural networks have achieved tremendous success in a wide variety of applications. However, their memory and compute demands can limit their use on resource-limited devices. At the same time, overparametrization seems to be necessary to overcome the highly non-convex nature of the training optimization problem. A trade-off must therefore be found that reduces network size while maintaining high performance. Popular approaches in the current literature are based on pruning techniques that look for subnetworks able to approximately maintain the initial performance. Nevertheless, these techniques are often unable to reduce the memory footprint of the training phase. In this thesis we present DLRT, a training algorithm that looks for "low-rank subnetworks" using dynamical low-rank approximation (DLRA) theory and techniques. These subnetworks and their ranks are determined and adapted already during the training phase, significantly reducing the overall time and memory required by both training and evaluation.
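To make the idea concrete, the following is a minimal sketch (in PyTorch) of a layer whose weight matrix is kept in low-rank factored form W ≈ U S Vᵀ, so that parameter storage and the forward pass scale with the rank r rather than with the full matrix dimensions. The class name LowRankLinear, the initialization, and the fixed rank are illustrative assumptions for this sketch; the thesis' DLRT algorithm additionally evolves the factors with dedicated DLRA integration steps and adapts the rank during training, which is not reproduced here.

```python
import torch
import torch.nn as nn


class LowRankLinear(nn.Module):
    """Linear layer with weight stored in factored form W ≈ U S V^T.

    Illustrative sketch only: DLRT evolves U, S, V with DLRA-based
    update steps and adapts the rank r during training.
    """

    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        # Small factors replace the full (out_features x in_features) weight.
        self.U = nn.Parameter(torch.randn(out_features, rank) / rank ** 0.5)
        self.S = nn.Parameter(torch.eye(rank))
        self.V = nn.Parameter(torch.randn(in_features, rank) / rank ** 0.5)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x V S^T U^T + b: memory/compute scale with r(n + m) instead of n*m.
        return x @ self.V @ self.S.T @ self.U.T + self.bias


# Example: a dense 784 -> 512 layer stores ~401k weights; a rank-32
# factorization stores only (784 + 512) * 32 + 32^2 ≈ 42k parameters.
layer = LowRankLinear(784, 512, rank=32)
out = layer(torch.randn(8, 784))
print(out.shape)  # torch.Size([8, 512])
```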
File | Size | Format
---|---|---
Zangrando_Emanuele.pdf (open access) | 1.04 MB | Adobe PDF
https://hdl.handle.net/20.500.12608/34907