Modeling Spatio-Temporal Data via Learnable Axonal Delays in Convolutional Recurrent Spiking Neural Networks

PLANTAMURA, DOMENICO
2025/2026

Abstract

Spiking Neural Networks (SNNs) have emerged as a promising, more energy-efficient paradigm for spatiotemporal processing than standard Artificial Neural Networks (ANNs): they process information through sparse, discrete binary events ("spikes"), whereas ANNs rely on dense, continuous real-valued computations that are typically more energy-intensive. Despite these advantages, effectively capturing complex temporal dynamics remains a significant challenge. In this work, we investigate the integration of two established, high-performing mechanisms: the Convolutional Spiking Gated Recurrent Unit (CS-GRU) architecture and learnable axonal delays. CS-GRU models are well suited to handling long-term dependencies and extracting local spatiotemporal features through convolutional operations, while learnable delays provide a complementary mechanism for fine-grained temporal alignment of asynchronous spike events. Building on these observations, we propose an architecture that embeds learnable axonal delays directly within the recurrent core of a CS-GRU, aiming to jointly leverage structured spatial feature extraction, recurrent memory, and adaptive temporal synchronization within a unified framework. We conduct an extensive empirical evaluation on three challenging spatiotemporal benchmarks: the Spiking Heidelberg Digits (SHD), Neuromorphic-TIDIGITS (N-TIDIGITS), and Spiking Speech Commands (SSC) datasets. Our analysis focuses on both the integrability and the scalability of the proposed approach. Results indicate that a simple single-layer integration struggles to reconcile the coupled optimization of weights and delay parameters, often leading to sub-optimal convergence; a two-layer architecture, in contrast, yields improved classification accuracy and more stable training dynamics.
Although the proposed method does not yet surpass state-of-the-art performance, this work provides a detailed investigation of the underlying architectural bottlenecks and of the interaction between recurrent delay learning and the Convolutional Spiking GRU architecture. Our findings suggest that decoupling convolutional, recurrent, and delay-based operations into specialized components may be a crucial design principle for future delay-augmented SNNs, offering a promising direction for more expressive and efficient neuromorphic systems.
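To make the core idea concrete, the sketch below shows one way per-neuron axonal delays can enter a spiking recurrent update: each hidden unit reads its recurrent input from a number of time steps in the past, drawn from a rolling buffer of previous output spikes. This is a hypothetical minimal illustration under simplifying assumptions (integer delays, a plain leaky integrate-and-fire update instead of the full CS-GRU gating, and illustrative function names); it is not the thesis implementation.

```python
import numpy as np

def delayed_recurrent_input(spike_history, delays):
    """Gather each unit's delayed recurrent input.

    spike_history: (T_buf, H) array of past output spikes, row 0 = most recent.
    delays: (H,) integer axonal delay per hidden unit, with 0 <= d < T_buf.
    Returns an (H,) vector where unit i sees the spike it emitted delays[i]
    steps ago. In a trainable model these delays would be learned (e.g. via
    a differentiable relaxation); here they are fixed integers for clarity.
    """
    h = spike_history.shape[1]
    return spike_history[delays, np.arange(h)]

def lif_step(x_t, rec_in, v, w_in, w_rec, threshold=1.0, decay=0.9):
    """One leaky integrate-and-fire update with delayed recurrence.

    x_t: (I,) input at time t; rec_in: (H,) delayed recurrent spikes;
    v: (H,) membrane potentials; w_in: (H, I); w_rec: (H, H).
    """
    v = decay * v + w_in @ x_t + w_rec @ rec_in
    spikes = (v >= threshold).astype(float)
    v = v * (1.0 - spikes)  # hard reset for units that fired
    return spikes, v
```

In a full model, `delayed_recurrent_input` would replace the plain previous-step recurrence inside each gate of the recurrent cell, so that weights and delays are optimized jointly, which is precisely the coupled optimization discussed above.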
Keywords: SNN; Axonal Delays; Spatio-Temporal
File: Plantamura_Domenico.pdf (open access, 1.3 MB, Adobe PDF)

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/108237