Training Spiking Neural Networks via the Alternating Direction Method of Multipliers (ADMM)
BIDINI, CESARE
2024/2025
Abstract
Spiking Neural Networks (SNNs) offer a promising, energy-efficient alternative to the current Deep Learning paradigm. Drawing inspiration from biological neurons, they enable sparse information transmission, which might be crucial in low-power and low-latency applications. However, their training presents significant challenges, mainly due to the non-differentiability of their activation functions, which prevents the direct use of gradient methods. Current training techniques, such as surrogate gradients, introduce approximations and suffer from issues that limit scalability and performance on deeper networks. This work presents the first iteration of a novel Alternating Direction Method of Multipliers (ADMM)-based optimizer specifically designed to address the non-differentiability of SNNs. By formulating SNN training as a constrained optimization problem, this framework allows for global minimization updates without relying on approximations of the activation function. A relaxed Lagrangian, containing an exact constraint only on the dynamics of the last-layer membrane potentials, is derived from the problem and minimized via ADMM. This results in an algorithm in which each sub-problem is solved analytically in closed form and which naturally extends to distributed and parallel learning. The proposed optimizer explicitly adapts previous ADMM applications in Artificial Neural Networks (ANNs) to the dynamics of SNNs, representing a gradient-free alternative to other supervised training techniques. Small-scale numerical simulations, carried out on two neuromorphic conversions of widely used datasets, the Neuromorphic MNIST (N-MNIST) and the Spiking Heidelberg Digits (SHD), demonstrate the optimizer's ability to achieve stable convergence. While the optimizer shows promising results on smaller datasets and shallower networks, particularly when random permutations of the update order are introduced, it also presents scalability issues, sensitivity to hyperparameters, and numerical instability related to matrix inversions. Although the origins of these issues are not entirely clear and additional investigation is needed, a close look at the results suggests room for improvement, including reformulating the variable updates to better respect binary outputs, improving the robustness of the matrix inversions, and introducing adaptive penalty coefficients to enhance the optimizer's performance and scalability.
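To make the formulation described above more concrete, the display below gives a rough sketch of a relaxed augmented Lagrangian of this kind, in the spirit of earlier ADMM-based training of ANNs extended to unrolled leaky integrate-and-fire dynamics. All symbols (weights $W_l$, membrane potentials $u_l[t]$, spikes $s_l[t]$, leak $\beta$, threshold $\vartheta$, penalties $\rho$, $\gamma_l$, $\mu_l$) are illustrative assumptions and not the thesis's own notation.

```latex
% Illustrative sketch only: the variable names and penalty structure are
% assumptions for exposition, not the formulation used in the thesis.

% Unrolled leaky integrate-and-fire dynamics (layer l, time step t),
% with H the non-differentiable Heaviside step:
\[
  u_l[t] = \beta\, u_l[t-1] + W_l\, s_{l-1}[t] - \vartheta\, s_l[t-1],
  \qquad
  s_l[t] = H\bigl(u_l[t] - \vartheta\bigr).
\]

% Training as a constrained problem over weights, potentials, and spikes:
\[
  \min_{\{W_l\},\,\{u_l\},\,\{s_l\}} \; \mathcal{L}\bigl(u_L, y\bigr)
  \quad \text{s.t. the dynamics above hold for every } l,\, t.
\]

% Dynamics residual of layer l at time t:
\[
  r_l[t] \;:=\; u_l[t] - \beta\, u_l[t-1] - W_l\, s_{l-1}[t] + \vartheta\, s_l[t-1].
\]

% Relaxed augmented Lagrangian: the constraint on the last-layer potentials is
% kept exact (multipliers lambda[t], penalty rho), while the hidden-layer
% constraints are replaced by quadratic penalties:
\[
\begin{aligned}
  \mathcal{L}_\rho ={}& \mathcal{L}(u_L, y)
    + \sum_t \Bigl( \bigl\langle \lambda[t],\, r_L[t] \bigr\rangle
    + \tfrac{\rho}{2}\bigl\lVert r_L[t] \bigr\rVert^2 \Bigr) \\
  &+ \sum_{l<L} \sum_t \Bigl( \tfrac{\gamma_l}{2}\bigl\lVert r_l[t] \bigr\rVert^2
    + \tfrac{\mu_l}{2}\bigl\lVert s_l[t] - H\bigl(u_l[t] - \vartheta\bigr) \bigr\rVert^2 \Bigr).
\end{aligned}
\]

% ADMM then alternates block minimizations over {W_l}, {u_l}, {s_l} (each
% admitting a closed-form solution) with a dual ascent step on lambda.
```

Under this kind of splitting, the weight and potential sub-problems reduce to regularized least-squares solves, which is where the matrix inversions (and the associated numerical-stability concerns mentioned in the abstract) would arise.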
| File | Size | Format | Access |
|---|---|---|---|
| Cesare_Bidini.pdf | 6.14 MB | Adobe PDF | Restricted access |
https://hdl.handle.net/20.500.12608/89826