stac

STAC: Spiking Transformer Augmenting Cognition for Conversational AI

Overview

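STAC (Spiking Transformer Augmenting Cognition) is a research framework that explores two complementary approaches to spiking neural network (SNN) language modeling: STAC V1, an end-to-end training pipeline built on learnable AdEx neurons, and STAC V2, an ANN-to-SNN conversion pipeline for pretrained transformer language models such as DistilGPT-2.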

Important: This repository currently runs software-level SNN simulations only. No metrics have been collected on physical neuromorphic hardware. Energy figures reported here are theoretical projections derived from spike-count analysis, not measured hardware data.
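To make "theoretical projections derived from spike-count analysis" concrete, the sketch below shows one common way such a projection is computed. It is not this repository's implementation. The per-operation energies (4.6 pJ per multiply-accumulate, 0.9 pJ per accumulate) are estimates often quoted for 45 nm CMOS and are assumptions here, as are the layer shape, the spike rate, and every function name.

```python
# Per-operation energies often quoted for 45 nm CMOS.
# Assumed for illustration; not taken from this repository.
E_MAC_PJ = 4.6  # one 32-bit multiply-accumulate, in picojoules
E_AC_PJ = 0.9   # one 32-bit accumulate, in picojoules


def ann_energy_pj(num_macs: int) -> float:
    """Projected energy of a dense layer that performs num_macs multiply-accumulates."""
    return num_macs * E_MAC_PJ


def snn_energy_pj(total_spikes: int, fan_out: int) -> float:
    """Projected energy when every input spike triggers fan_out accumulates."""
    return total_spikes * fan_out * E_AC_PJ


def projected_ratio(in_features: int, out_features: int,
                    spike_rate: float, timesteps: int) -> float:
    """SNN / ANN energy ratio for one linear layer processing one token."""
    macs = in_features * out_features
    # Expected number of input spikes over the whole simulation window.
    total_spikes = round(in_features * spike_rate * timesteps)
    return snn_energy_pj(total_spikes, out_features) / ann_energy_pj(macs)


# Hypothetical layer shape and spike rate, chosen only to exercise the formula.
ratio = projected_ratio(in_features=768, out_features=3072,
                        spike_rate=0.1, timesteps=8)
print(f"projected SNN/ANN energy ratio: {ratio:.3f}")
```

A projection like this counts arithmetic operations only. It ignores memory access, routing, and static power, which is one reason it cannot stand in for measurements on neuromorphic hardware.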

Key Features

Quick Start

```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Convert DistilGPT-2 to an SNN
python run_conversion.py --model_name distilgpt2 --timesteps 8 --simplified

# 3. Run a multi-turn conversation smoke test
python snn_multi_turn_conversation_test.py --mode snn --turns 3 --timesteps 8

# 4. Run the comprehensive validation suite
python test_conversational_snn.py --model_name distilgpt2 --test_all --timesteps 8
```
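For intuition about the `--timesteps` flag, the sketch below shows the general idea behind rate-coded ANN-to-SNN conversion: an integrate-and-fire neuron driven by a constant input fires at a rate that approximates a ReLU activation, and the approximation gets finer as the number of timesteps grows. This is a generic illustration in plain Python, not the code path that `run_conversion.py` executes; the function names and the soft-reset choice are assumptions.

```python
def if_neuron_rate(input_current: float, timesteps: int,
                   threshold: float = 1.0) -> float:
    """Simulate one integrate-and-fire neuron under constant input.

    Returns the firing rate (spikes per timestep) over the window.
    """
    membrane = 0.0
    spikes = 0
    for _ in range(timesteps):
        membrane += input_current      # integrate the input
        if membrane >= threshold:      # fire when the threshold is crossed
            spikes += 1
            membrane -= threshold      # soft reset: keep the residual charge
    return spikes / timesteps


def relu(x: float) -> float:
    """Reference ANN activation that the firing rate approximates."""
    return max(0.0, x)


# Compare the spike-rate approximation at a short and a long window.
for timesteps in (8, 64):
    rate = if_neuron_rate(0.3, timesteps)
    print(timesteps, rate, abs(rate - relu(0.3)))
```

With a small window such as 8 timesteps the firing rate can only take values in steps of 1/8, so the trade-off is between simulation cost and how closely the spike rate tracks the original activation.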

Core Components

STAC V2

| Component | Purpose |
| --- | --- |
| `smollm2_converter.py` | Specialized converter with TemporalSpikeProcessor. |
| `convert.py` | Generic ANN-to-SNN conversion pipeline. |
| `run_conversion.py` | Main CLI entry point for conversions. |
| `spikingjelly_compat.py` | Cross-version compatibility layer for SpikingJelly. |
| `test_conversational_snn.py` | Comprehensive test suite. |
| `snn_multi_turn_conversation_test.py` | Lightweight multi-turn smoke test. |

STAC V1

| Component | Purpose |
| --- | --- |
| `stac-v1/stacv1.ipynb` | End-to-end training pipeline with learnable AdEx neurons. |
| `stac-v1/README.md` | V1 documentation and research notes. |
| `stac_v1/` + `run_stac_v1.py` | Repo-native runnable V1 pipeline demonstrating hybrid fine-tuning (frozen GPT-2 with a trained spiking and memory head). |
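As background for the AdEx neurons mentioned above, here is a minimal forward-Euler simulation of a single adaptive exponential integrate-and-fire neuron. It is a sketch of the standard model, not the V1 notebook's code: the parameter values are the ones published by Brette and Gerstner (2005) and are held fixed, whereas a learnable variant would treat some of them as trainable. The function name and the input current are hypothetical.

```python
import math


def simulate_adex(current_pa: float, duration_ms: float = 200.0,
                  dt_ms: float = 0.1) -> list:
    """Forward-Euler simulation of one AdEx neuron; returns spike times in ms.

    Units: mV, ms, pF, nS, pA.
    """
    # Fixed parameters from Brette & Gerstner (2005); assumed for illustration.
    C, g_L, E_L = 281.0, 30.0, -70.6       # capacitance, leak conductance, rest
    V_T, delta_T = -50.4, 2.0              # threshold and slope factor
    tau_w, a, b = 144.0, 4.0, 80.5         # adaptation time constant and gains
    V_peak, V_reset = 20.0, -70.6          # spike cutoff and reset potential

    v, w = E_L, 0.0
    spike_times = []
    for step in range(round(duration_ms / dt_ms)):
        exp_term = g_L * delta_T * math.exp((v - V_T) / delta_T)
        dv = (-g_L * (v - E_L) + exp_term - w + current_pa) / C
        dw = (a * (v - E_L) - w) / tau_w
        v += dt_ms * dv
        w += dt_ms * dw
        if v >= V_peak:                    # spike: reset voltage, bump adaptation
            spike_times.append(step * dt_ms)
            v = V_reset
            w += b
    return spike_times


spikes = simulate_adex(1000.0)             # hypothetical 1 nA step current
intervals = [t1 - t0 for t0, t1 in zip(spikes, spikes[1:])]
print(len(spikes), intervals[:3])
```

The adaptation variable `w` grows with each spike, so successive inter-spike intervals lengthen under a constant input. That history dependence is what distinguishes AdEx from a plain integrate-and-fire neuron.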

Implementation Status

STAC V2

Completed (prototype level)

Pending or in progress

STAC V1

Completed (research prototype)

Documentation

STAC V2

STAC V1

Testing and Validation

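The test suite in `test_conversational_snn.py` checks multi-turn conversational correctness. Each check (position boundaries, attention masking, multi-turn behavior, energy) can be run on its own, or all of them can be run together: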

```bash
# Test specific components
python test_conversational_snn.py --model_name distilgpt2 --test_position_boundaries
python test_conversational_snn.py --model_name distilgpt2 --test_attention_mask
python test_conversational_snn.py --model_name distilgpt2 --test_multi_turn
python test_conversational_snn.py --model_name distilgpt2 --test_energy

# Run the full suite
python test_conversational_snn.py --model_name distilgpt2 --test_all
```
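If you want a per-component pass/fail summary instead of one combined run, a small wrapper along these lines could invoke each flag in turn. The helper names are hypothetical, and the sketch assumes that `test_conversational_snn.py` exits with a non-zero status when a check fails, which the repository does not document here.

```python
import subprocess
import sys

# Component flags as listed in the commands above.
TEST_FLAGS = [
    "--test_position_boundaries",
    "--test_attention_mask",
    "--test_multi_turn",
    "--test_energy",
]


def build_command(flag: str, model_name: str = "distilgpt2",
                  timesteps=None) -> list:
    """Build the argument list for one component test."""
    cmd = [sys.executable, "test_conversational_snn.py",
           "--model_name", model_name, flag]
    if timesteps is not None:
        cmd += ["--timesteps", str(timesteps)]
    return cmd


def run_all(model_name: str = "distilgpt2") -> dict:
    """Run every component test and map each flag to its exit code.

    Assumes a non-zero exit code signals failure.
    """
    results = {}
    for flag in TEST_FLAGS:
        completed = subprocess.run(build_command(flag, model_name), check=False)
        results[flag] = completed.returncode
    return results


# Dry run: show the commands without executing them.
for flag in TEST_FLAGS:
    print(" ".join(build_command(flag)))
```

Call `run_all()` from the repository root to execute the checks; the dry-run loop only prints the commands.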

License

This project is licensed under the MIT License. See the LICENSE file for details.