# STAC V1: End-to-End Training Pipeline

## Overview

STAC V1 represents the original research approach: a complete end-to-end training pipeline for spiking transformers. This version established the foundational concepts that were later adapted for the conversion-based approach in STAC V2.
## Key Differences: V1 vs V2

| Aspect | STAC V1 | STAC V2 |
|--------|---------|---------|
| Approach | End-to-end training from scratch | ANN→SNN conversion |
| Architecture | Learnable AdEx neurons | Converted transformer layers |
| Memory | Hyperdimensional Memory Module (HEMM) | Temporal Spike Processor (TSP) |
| Training | Surrogate gradient training | Pre-trained model conversion |
| Scope | Single-turn processing | Multi-turn conversations |
| Status | Complete research prototype | Experimental conversion framework |
## STAC V1 Contributions

### 🧠 Neuromorphic Architecture

- **Learnable AdEx Neurons**: Adaptive exponential neurons with biologically plausible parameters
- **Surrogate Gradient Training**: Successful training of spiking transformers using surrogate gradients
- **L1 Spike Regularization**: Energy-efficient spike patterns (see the sketch after this list)
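A minimal PyTorch sketch of the first two pieces together is shown below: an AdEx-style neuron layer with learnable time constants and a surrogate-gradient spike function. The τ_m=20.0 and τ_w=144.0 defaults come from the notebook; the fast-sigmoid surrogate, threshold, sharpness, and forward-Euler discretization are illustrative assumptions, not the notebook's exact implementation.

```python
import torch
import torch.nn as nn

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, fast-sigmoid surrogate in the backward pass."""
    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # Surrogate derivative 1 / (1 + |v|)^2 in place of the true (zero a.e.) gradient
        return grad_output / (1.0 + v.abs()) ** 2

class AdExNeuron(nn.Module):
    """AdEx layer with learnable time constants (tau_m=20.0, tau_w=144.0 as in the
    notebook); the remaining constants and discretization are illustrative choices."""
    def __init__(self, size, tau_m=20.0, tau_w=144.0, dt=1.0):
        super().__init__()
        self.tau_m = nn.Parameter(torch.full((size,), tau_m))  # membrane time constant
        self.tau_w = nn.Parameter(torch.full((size,), tau_w))  # adaptation time constant
        self.a = nn.Parameter(torch.zeros(size))                # subthreshold adaptation coupling
        self.b = nn.Parameter(torch.full((size,), 0.1))         # spike-triggered adaptation jump
        self.delta_t, self.v_th, self.dt = 0.5, 1.0, dt         # sharpness, threshold, step size

    def forward(self, current, v, w):
        # Forward-Euler update of the AdEx membrane and adaptation equations
        exp_term = self.delta_t * torch.exp(
            torch.clamp((v - self.v_th) / self.delta_t, max=10.0)  # clamp to avoid overflow
        )
        v = v + self.dt / self.tau_m * (-v + exp_term - w + current)
        w = w + self.dt / self.tau_w * (self.a * v - w)
        spikes = SurrogateSpike.apply(v - self.v_th)
        v = v * (1.0 - spikes)   # reset membrane where a spike occurred
        w = w + self.b * spikes  # spike-triggered adaptation increment
        return spikes, v, w
```

In use, the state tensors `v` and `w` would start at zero and the layer would be stepped once per simulation time step, with the spike outputs accumulated over the window.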
### 🧩 Memory Integration

- **Hyperdimensional Memory Module (HEMM)**: 1024-dimensional memory projection
- **Spike Pooling**: Temporal aggregation of spike trains
- **Memory Bias**: Context-aware processing (sketched below)
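The sketch below illustrates how these three pieces could fit together in PyTorch. Only the 1024-dimensional projection is taken from the description above; the class structure, the `tanh` nonlinearity, and mean-pooling as the spike-pooling operator are assumptions for illustration.

```python
import torch
import torch.nn as nn

class HEMM(nn.Module):
    """Hyperdimensional memory sketch: pool spike trains over time, project into a
    high-dimensional memory space, and return a bias added back to hidden states."""
    def __init__(self, hidden_dim, memory_dim=1024):
        super().__init__()
        self.to_memory = nn.Linear(hidden_dim, memory_dim, bias=False)    # 1024-dim projection
        self.from_memory = nn.Linear(memory_dim, hidden_dim, bias=False)  # read-out

    def forward(self, spikes):
        # spikes: (batch, time, hidden_dim) binary spike trains
        pooled = spikes.mean(dim=1)                   # spike pooling: temporal firing rate
        memory = torch.tanh(self.to_memory(pooled))   # hyperdimensional memory vector
        bias = self.from_memory(memory)               # context-aware memory bias
        return bias.unsqueeze(1)                      # broadcast over the time dimension
```

A layer would then consume it as `hidden = hidden + hemm(spike_trains)`, injecting pooled context into every time step.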
### 📊 Validation Suite

- **Comprehensive Testing**: Position ID boundaries, attention masks, spike rates
- **Energy Analysis**: Theoretical energy savings projections (see the sketch after this list)
- **Quality Metrics**: Perplexity and coherence measurements
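As one example of how such a theoretical projection can be computed, the sketch below compares dense multiply-accumulates against spike-driven accumulates. The per-operation energy constants (4.6 pJ per MAC, 0.9 pJ per accumulate, commonly cited 45 nm figures) and the function itself are illustrative assumptions, not values from the validation suite.

```python
def projected_energy_savings(spike_rate, synapses, e_mac=4.6e-12, e_ac=0.9e-12):
    """Compare a dense ANN layer (one MAC per synapse) against an SNN layer
    (one accumulate per spike per synapse). Energy constants are illustrative
    45 nm figures, not measured values."""
    ann_energy = synapses * e_mac               # every synapse fires a MAC
    snn_energy = synapses * spike_rate * e_ac   # only spiking synapses accumulate
    return ann_energy / snn_energy

# e.g. a 10% spike rate projects a ~51x energy reduction for a 1M-synapse layer
print(f"{projected_energy_savings(spike_rate=0.10, synapses=1_000_000):.1f}x")
```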
## Implementation Details

### Model Architecture

```python
# Key components in stacv1.ipynb:
# - AdEx neurons with learnable parameters (τ_m=20.0, τ_w=144.0, etc.)
# - HEMM with 1024-dim projection matrix
# - L1 regularization for energy efficiency
# - Surrogate gradient training on WikiText-2
```
### Training Process

1. **Data Loading**: WikiText-2 raw dataset
2. **Model Initialization**: Learnable AdEx parameters
3. **Forward Pass**: Spike accumulation and memory integration
4. **Loss Computation**: Cross-entropy + L1 spike penalty
5. **Backward Pass**: Surrogate gradient updates (a single-step sketch follows this list)
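A minimal sketch of one such step, assuming a model that returns `(logits, spikes)` (a hypothetical API) and an illustrative L1 weight `l1_lambda`:

```python
import torch
import torch.nn.functional as F

def training_step(model, batch, optimizer, l1_lambda=1e-4):
    """One surrogate-gradient step: next-token cross-entropy for language
    modeling plus an L1 penalty on spike activity (l1_lambda is illustrative)."""
    optimizer.zero_grad()
    logits, spikes = model(batch["input_ids"])  # assumed model API: (B, T, V), spike tensor
    ce_loss = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),  # predict token t+1 from token t
        batch["input_ids"][:, 1:].reshape(-1),
    )
    spike_penalty = l1_lambda * spikes.abs().mean()  # encourages sparse, energy-efficient firing
    loss = ce_loss + spike_penalty
    loss.backward()   # gradients flow through the surrogate spike function
    optimizer.step()
    return loss.item()
```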
## Research Impact

STAC V1 demonstrated several key innovations:

- ✅ First successful surrogate gradient training of spiking transformers
- ✅ Learnable neuromorphic dynamics with AdEx neurons
- ✅ Hyperdimensional memory integration in spiking networks
- ✅ Energy-efficient spike regularization techniques
## Usage

```bash
# Open the Jupyter notebook
jupyter notebook stac-v1/stacv1.ipynb

# Or view in VS Code
code stac-v1/stacv1.ipynb
```
## Evolution to STAC V2

STAC V2 evolved from V1 by:

- Shifting to a conversion-based approach for practical deployment
- Extending to multi-turn conversations with the Temporal Spike Processor
- Focusing on hardware compatibility for neuromorphic deployment
- Maintaining V1's energy-efficiency principles in the conversion framework
**Note**: STAC V1 is a complete research prototype that has been validated and documented. STAC V2 builds upon these foundations with a different methodological approach focused on practical deployment.