Modeling Spatio-Temporal Data via Learnable Axonal Delays in Convolutional Recurrent Spiking Neural Networks
PLANTAMURA, DOMENICO
2025/2026
Abstract
Spiking Neural Networks (SNNs) have emerged as a promising, energy-efficient paradigm for spatiotemporal processing: unlike standard Artificial Neural Networks (ANNs), which rely on dense, continuous real-valued computation, SNNs process information through sparse, binary events called "spikes". Despite these advantages, effectively capturing complex temporal dynamics remains a significant challenge. In this work, we investigate the integration of two established and high-performing mechanisms: the Convolutional Spiking Gated Recurrent Unit (CS-GRU) architecture and learnable axonal delays. While CS-GRU models are well suited to handling long-term dependencies and extracting local spatiotemporal features through convolutional operations, learnable delays provide a complementary mechanism for fine-grained temporal alignment of asynchronous spike events. Building on these observations, we propose an architecture that embeds learnable axonal delays directly within the recurrent core of a CS-GRU. This design aims to jointly leverage structured spatial feature extraction, recurrent memory, and adaptive temporal synchronization within a unified framework. We conduct an extensive empirical evaluation on three challenging spatiotemporal benchmarks: the Spiking Heidelberg Digits (SHD), Neuromorphic-TIDIGITS (N-TIDIGITS), and Spiking Speech Commands (SSC) datasets. Our analysis focuses on both the integrability and scalability of the proposed approach. Results indicate that a simple single-layer integration struggles to reconcile the coupled optimization of weights and delay parameters, often leading to sub-optimal convergence. In contrast, the transition to a two-layer architecture yields improved classification accuracy and more stable training dynamics.
Although the proposed method does not yet surpass current state-of-the-art performance, this work provides a detailed investigation of the underlying architectural bottlenecks and of the interaction between recurrent delay learning and the Convolutional Spiking GRU architecture. Our findings suggest that decoupling convolutional, recurrent, and delay-based operations into specialized components may be a crucial design principle for future delay-augmented SNNs, offering a promising direction for the development of more expressive and efficient neuromorphic systems.
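The abstract describes learnable axonal delays as per-connection temporal shifts that align asynchronous spike events before they enter the recurrent core. As a minimal illustrative sketch only (not the thesis's implementation; the function and variable names here are hypothetical, and this integer-shift formulation omits the differentiable relaxation one would need to actually learn the delays by gradient descent), a per-channel delay can be realized by shifting each channel of a binary spike train along the time axis:

```python
import numpy as np

def apply_axonal_delays(spikes, delays):
    """Delay each channel of a binary spike train by its own offset.

    spikes: (T, C) array of 0/1 spike events over T time steps, C channels.
    delays: length-C sequence of non-negative integer delays (time steps).
    Spikes shifted past the final time step are dropped, mimicking a
    finite-length axonal delay line.
    """
    T, C = spikes.shape
    delayed = np.zeros_like(spikes)
    for c in range(C):
        d = int(delays[c])
        if d < T:
            # Shift channel c forward in time by d steps.
            delayed[d:, c] = spikes[:T - d, c]
    return delayed

# Two asynchronous input channels: channel 0 fires at t=0, channel 1 at t=2.
spikes = np.zeros((5, 2))
spikes[0, 0] = 1.0
spikes[2, 1] = 1.0

# A delay of 2 steps on channel 0 aligns both events at t=2, so a
# downstream recurrent unit would see them coincide.
aligned = apply_axonal_delays(spikes, delays=[2, 0])
```

In practice, published delay-learning approaches make the shift differentiable (e.g. by interpolating over fractional delays) so the delay parameters can be trained jointly with the synaptic weights, which is exactly the coupled optimization the abstract reports as difficult in the single-layer setting.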
File: Plantamura_Domenico.pdf (open access, 1.3 MB, Adobe PDF)
The text of this website © Università degli studi di Padova. Full Text are published under a non-exclusive license. Metadata are under a CC0 License
https://hdl.handle.net/20.500.12608/108237