PR #1904

open

Non-record submission: Activation-Space CSA on SP1024 (8xH100)

by lucribas
val_bpb: 1.2580
Architecture: Transformer
Optimizer:
Artifact Size: 15,829,207 bytes

Training Techniques

  • Test-Time Training: score-first TTT
    parameters: {"adapter":"ACSA","reset_between_documents":true}
  • Other: Activation-Space Compressed-Sensing Adapters (ACSA), which adapt hidden activations with a sparse code instead of LoRA weight deltas
    parameters: {"targets":["postblock"],"optional_targets":["prehead"]}
  • Compression: zlib (level: null)
  • Sequence Length: train_length: 1024, eval_length: 1024
  • Quantization: int8 (bits: 8, scope: model artifact)
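The int8 quantization and zlib compression of the model artifact can be sketched as a roundtrip like the one below. This is a minimal illustration, not the submission's actual packing code: the per-tensor symmetric scale, the tensor shape, and the use of zlib's default level (a plausible reading of "level: null") are all assumptions.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical weight tensor standing in for the real model artifact.
weights = rng.standard_normal((1024, 256)).astype(np.float32)

# Symmetric per-tensor int8 quantization (an assumed scheme; the metadata
# only states "int8, bits: 8, scope: model artifact").
scale = np.abs(weights).max() / 127.0
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

# zlib roundtrip; no level argument, so the library default is used.
blob = zlib.compress(q.tobytes())
restored = np.frombuffer(zlib.decompress(blob), dtype=np.int8).reshape(weights.shape)

# Dequantized weights as they would be reconstructed at evaluation time.
dequant = restored.astype(np.float32) * scale
```

The compressed `blob` plus the scale is what would count against the 16 MB artifact cap; the roundtrip is lossless in the int8 domain, so all reconstruction error comes from quantization.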

Novel Contributions

  • Activation-Space Compressed-Sensing Adapters (ACSA) as an alternative to LoRA-based evaluation-time adaptation
  • Sparse activation-space adaptation using a structured sensing map with sign flips, permutation, and FWHT
  • Preservation of score-before-update evaluation protocol with adapter state reset between documents
  • Non-record SP1024 submission demonstrating improvement over the quantized no-ACSA roundtrip while staying under the 16 MB artifact cap
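The structured sensing map named above (sign flips, permutation, and a fast Walsh-Hadamard transform) can be sketched as follows. This is a hedged illustration of the general technique, not the submission's implementation: the sparsity level, the additive way the sensed code is applied to a hidden activation, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fwht(x):
    """Normalized fast Walsh-Hadamard transform (length must be a power of 2)."""
    x = x.copy()
    n = x.shape[-1]
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            a = x[..., i:i + h].copy()
            b = x[..., i + h:i + 2 * h].copy()
            x[..., i:i + h] = a + b
            x[..., i + h:i + 2 * h] = a - b
        h *= 2
    return x / np.sqrt(n)

d = 1024                                  # matches the SP1024 hidden width by assumption
signs = rng.choice([-1.0, 1.0], size=d)   # random sign flips
perm = rng.permutation(d)                 # random coordinate permutation

def sense(z):
    """Structured sensing map: sign flips -> permutation -> FWHT (orthonormal)."""
    return fwht((z * signs)[perm])

# Hypothetical ACSA-style step: a sparse code c adapts a hidden activation h
# additively through the sensing map (sparsity of 16 is an arbitrary choice).
h = rng.standard_normal(d)
c = np.zeros(d)
c[:16] = 0.01 * rng.standard_normal(16)
h_adapted = h + sense(c)
```

Because each stage is orthonormal, `sense` preserves norms, which is the usual reason such structured maps are used in compressed sensing in place of dense random matrices: they apply in O(d log d) time and need only O(d) stored parameters.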