Training Spiking Neural Networks via the Alternating Direction Method of Multipliers (ADMM)
BIDINI, CESARE
2024/2025
Abstract
Spiking Neural Networks (SNNs) offer a promising, energy-efficient alternative to the current Deep Learning paradigm. Drawing inspiration from biological neurons, they enable sparse information transmission, which might be crucial in low-power and low-latency applications. However, their training presents significant challenges, mainly due to the non-differentiability of their activation functions, which prevents the direct use of gradient methods. Current training techniques, such as surrogate gradients, introduce approximations and suffer from issues that limit scalability and performance on deeper networks. This work presents the first iteration of a novel Alternating Direction Method of Multipliers (ADMM)-based optimizer specifically designed to address the non-differentiability of SNNs. By formulating SNN training as a constrained optimization problem, this framework allows for global minimization updates without relying on approximations of the activation function. A relaxed Lagrangian, containing an exact constraint only on the dynamics of the last-layer membrane potentials, is derived from the problem and minimized via ADMM. This results in an algorithm in which each sub-problem is solved analytically in closed form and which naturally extends to distributed and parallel learning. The proposed optimizer explicitly adapts previous ADMM applications in Artificial Neural Networks (ANNs) to the dynamics of SNNs, representing a gradient-free alternative to other supervised training techniques. Small-scale numerical simulations, carried out on two neuromorphic conversions of widely used datasets, the Neuromorphic MNIST (N-MNIST) and the Spiking Heidelberg Digits (SHD), demonstrate the optimizer's ability to achieve stable convergence. While the optimizer shows promising results on smaller datasets and shallower networks, particularly when random permutations of the update order are introduced, it also presents scalability issues, sensitivity to hyperparameters, and numerical instability related to matrix inversions. Although the origins of these issues are not entirely clear and additional investigation is needed, a close look at the results suggests room for improvement, including reformulating the variable updates to better respect binary outputs, improving the robustness of the matrix inversions, and introducing adaptive penalty coefficients to enhance the optimizer's performance and scalability.
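To make the formulation described above more concrete, the display below gives a rough sketch of a relaxed augmented Lagrangian of this kind, in the spirit of earlier ADMM-based training of ANNs extended to unrolled leaky integrate-and-fire dynamics. All symbols (weights $W_l$, membrane potentials $u_l[t]$, spikes $s_l[t]$, leak $\beta$, threshold $\vartheta$, penalties $\rho$, $\gamma_l$, $\mu_l$) are illustrative assumptions and not the thesis's own notation.

```latex
% Illustrative sketch only: the variable names and penalty structure are
% assumptions for exposition, not the formulation used in the thesis.

% Unrolled leaky integrate-and-fire dynamics (layer l, time step t),
% with H the non-differentiable Heaviside step:
\[
  u_l[t] = \beta\, u_l[t-1] + W_l\, s_{l-1}[t] - \vartheta\, s_l[t-1],
  \qquad
  s_l[t] = H\bigl(u_l[t] - \vartheta\bigr).
\]

% Training as a constrained problem over weights, potentials, and spikes:
\[
  \min_{\{W_l\},\,\{u_l\},\,\{s_l\}} \; \mathcal{L}\bigl(u_L, y\bigr)
  \quad \text{s.t. the dynamics above hold for every } l,\, t.
\]

% Dynamics residual of layer l at time t:
\[
  r_l[t] \;:=\; u_l[t] - \beta\, u_l[t-1] - W_l\, s_{l-1}[t] + \vartheta\, s_l[t-1].
\]

% Relaxed augmented Lagrangian: the constraint on the last-layer potentials is
% kept exact (multipliers lambda[t], penalty rho), while the hidden-layer
% constraints are replaced by quadratic penalties:
\[
\begin{aligned}
  \mathcal{L}_\rho ={}& \mathcal{L}(u_L, y)
    + \sum_t \Bigl( \bigl\langle \lambda[t],\, r_L[t] \bigr\rangle
    + \tfrac{\rho}{2}\bigl\lVert r_L[t] \bigr\rVert^2 \Bigr) \\
  &+ \sum_{l<L} \sum_t \Bigl( \tfrac{\gamma_l}{2}\bigl\lVert r_l[t] \bigr\rVert^2
    + \tfrac{\mu_l}{2}\bigl\lVert s_l[t] - H\bigl(u_l[t] - \vartheta\bigr) \bigr\rVert^2 \Bigr).
\end{aligned}
\]

% ADMM then alternates block minimizations over {W_l}, {u_l}, {s_l} (each
% admitting a closed-form solution) with a dual ascent step on lambda.
```

Under this kind of splitting, the weight and potential sub-problems reduce to regularized least-squares solves, which is where the matrix inversions (and the associated numerical-stability concerns mentioned in the abstract) would arise.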
| File | Size | Format | Access |
|---|---|---|---|
| Cesare_Bidini.pdf | 6.14 MB | Adobe PDF | Restricted access |
https://hdl.handle.net/20.500.12608/89826