Gated Linear Networks for Continual Learning in a Class-Incremental with Repetition Scenario

Continual learning, the incremental acquisition of knowledge over time, is a challenging problem in complex environments where the data distribution may change over time. Although neural networks achieve strong results on a wide variety of tasks, they still struggle to match that performance in a continual learning setting, where they suffer from a problem known as catastrophic forgetting. This problem, which consists of a model's tendency to overwrite old knowledge when new knowledge is presented, has been addressed through a variety of strategies that adapt models at different levels. Among these, this work focuses on Gated Linear Networks (GLNs), a class of models that rely on a gating mechanism to improve the storage and retrieval of information over time. GLNs have already been applied to continual learning with promising results, but only in extremely simplified frameworks. In this work we aim to define a more complex continual learning environment and to adapt GLNs to the additional challenges it presents, evaluating their strengths and limitations. In particular, we found that an encoding step can help make a complex dataset more spatially separable, and therefore make GLNs more effective, and that switching to a Class-Incremental with Repetition scenario both increases the realism of the framework and eases the learning difficulty.
