Extending Gated Linear Networks for Continual Learning
MUNARI, MATTEO
2021/2022
Abstract
Incrementally learning multiple tasks from an indefinitely long stream of data is a real challenge for traditional machine learning models. If not carefully controlled, the learning of new knowledge strongly impacts a model's previously learned abilities, making it forget how to solve past tasks. Continual learning addresses this problem, called catastrophic forgetting, by developing models able to continually learn new tasks and adapt to changes in the data distribution. In this dissertation, we consider the recently proposed family of continual learning models called Gated Linear Networks (GLNs) and study two crucial aspects that affect the amount of catastrophic forgetting in GLNs: data standardization and the gating mechanism. Data standardization is particularly challenging in the online/continual learning setting because data from future tasks is not available beforehand. The results obtained using an online standardization method show a considerably higher amount of forgetting compared to an offline (static) standardization. Interestingly, with the latter standardization, we observe that GLNs show almost no forgetting on the considered benchmark datasets. Secondly, for GLNs to be effective, it is essential to tailor the hyperparameters of the gating mechanism to the data distribution. In this dissertation, we propose a gating strategy based on a set of prototypes and the resulting Voronoi tessellation. The experimental assessment shows that, in an ideal setting where the data distribution is known, the proposed approach is more robust to different data standardizations than the original halfspace gating mechanism and achieves improved predictive performance. Finally, we propose an adaptive mechanism for the choice of prototypes, which expands and shrinks the set of prototypes in an online fashion, making the model suitable for practical continual learning applications. The experimental results show that the performance of the adaptive model is close to that of the ideal scenario where prototypes are directly sampled from the data distribution.
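As a minimal sketch of the standardization issue mentioned above (not the dissertation's code), the online variant can be read as standardizing each sample with running estimates of the per-feature mean and variance, e.g. via Welford's algorithm, whereas the offline/static variant uses statistics computed once over the whole dataset beforehand. Class and parameter names here are illustrative assumptions.

```python
import numpy as np

class OnlineStandardizer:
    """Standardizes samples on the fly with running mean/variance estimates."""

    def __init__(self, dim, eps=1e-8):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros(dim)   # running sum of squared deviations
        self.eps = eps

    def update_and_transform(self, x):
        # Welford's online update of mean and variance.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        var = self.m2 / self.n if self.n > 1 else np.ones_like(self.m2)
        return (x - self.mean) / np.sqrt(var + self.eps)
```

Because the running statistics drift as new tasks arrive, the same raw input is mapped to different standardized values over time, which is one plausible reason the online scheme interacts badly with forgetting compared to a fixed, precomputed standardization.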
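The following sketch contrasts the two gating mechanisms discussed in the abstract: the original halfspace gating of GLNs, where each of k random hyperplanes contributes one bit of the context index, and the proposed prototype-based gating, where the context is the index of the Voronoi cell (nearest prototype) containing the input. Shapes and names are illustrative assumptions, not the dissertation's implementation.

```python
import numpy as np

def halfspace_context(z, hyperplanes, biases):
    """Halfspace gating: k sign bits packed into a context index in [0, 2**k)."""
    bits = (hyperplanes @ z >= biases).astype(int)        # shape (k,)
    return int(np.dot(bits, 2 ** np.arange(len(bits))))   # binary -> integer

def voronoi_context(z, prototypes):
    """Prototype gating: index of the nearest prototype, i.e. the Voronoi cell of z."""
    distances = np.linalg.norm(prototypes - z, axis=1)    # shape (m,)
    return int(np.argmin(distances))

# Example usage with random side information of dimension d.
rng = np.random.default_rng(0)
d, k, m = 8, 4, 16
z = rng.normal(size=d)
print(halfspace_context(z, rng.normal(size=(k, d)), rng.normal(size=k)))
print(voronoi_context(z, rng.normal(size=(m, d))))
```

One way to read the robustness claim: halfspace boundaries are defined in the coordinates of the (standardized) input, so rescaling the data moves points across context boundaries, whereas prototypes placed in the data distribution partition the space relative to the data itself.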
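The abstract does not specify how the adaptive mechanism decides when to expand or shrink the prototype set, so the sketch below is purely generic: grow when a sample falls far from every existing prototype, prune prototypes that are rarely the nearest one. Thresholds and counters are hypothetical.

```python
import numpy as np

class AdaptivePrototypes:
    """Generic online maintenance of a prototype set (illustrative only)."""

    def __init__(self, add_radius, min_usage, prune_every=1000):
        self.add_radius = add_radius    # hypothetical distance threshold for adding
        self.min_usage = min_usage      # hypothetical usage threshold for pruning
        self.prune_every = prune_every
        self.prototypes = []            # list of np.ndarray
        self.usage = []                 # how often each prototype was the nearest
        self.seen = 0

    def observe(self, x):
        self.seen += 1
        # Periodically shrink: drop prototypes that are rarely selected.
        if self.seen % self.prune_every == 0 and self.prototypes:
            keep = [i for i, u in enumerate(self.usage) if u >= self.min_usage]
            if keep:  # never drop every prototype
                self.prototypes = [self.prototypes[i] for i in keep]
                self.usage = [0] * len(self.prototypes)
        if not self.prototypes:
            self.prototypes.append(x.copy())
            self.usage.append(1)
            return 0
        dists = [np.linalg.norm(x - p) for p in self.prototypes]
        nearest = int(np.argmin(dists))
        # Expand: the sample is far from all current prototypes.
        if dists[nearest] > self.add_radius:
            self.prototypes.append(x.copy())
            self.usage.append(1)
            return len(self.prototypes) - 1
        self.usage[nearest] += 1
        return nearest  # index of the Voronoi cell used as gating context
```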
| File | Size | Format |
|---|---|---|
| Munari_Matteo.pdf (open access) | 1.71 MB | Adobe PDF |
https://hdl.handle.net/20.500.12608/29690