This thesis analyzes several variants of the Adam (adaptive moment estimation) optimizer, which is used mainly to train convolutional neural networks (CNNs). Adam derives from optimizers based on stochastic gradient descent (SGD), a method that reduces the classifier's error function by updating the weights of the network's neurons using the gradient of a loss function. The work begins with the implementation and analysis of AngularGrad and AdaInject, two recently proposed variants of Adam, which are then used to build ensembles together with variants already proposed in the literature.
Implementation and study of ADAM-based algorithms
GIANNINI, LORENZO
2021/2022
File | Access | Size | Format |
---|---|---|---|
Giannini_Lorenzo.pdf | open access | 3.51 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/34525