PR #1844

open

Add SkipQuant Adapter TTT (int4 skip_gates/weights + 4-epoch TTT)

by Hetul803
val_bpb
1.3110
Architecture
Transformer
Optimizer
Artifact Size
15,999,098 bytes

Training Techniques

Quantization
int4
bits: 4
scope: skip_gates and skip_weights
Test-Time Training
score-first TTT
parameters: {"rank":1024,"learning_rate":0.025,"epochs":4,"chunk_size":4096}
Sequence Length
sequence_length
train_length: null
eval_length: 4096
Other
other
SP8192 tokenizer
parameters: null
Architecture
U-Net skip connections
Selective skip-pathway adapter mechanism with int4-quantized skip gates and skip weights
parameters: null
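
To make the int4 entry concrete, here is a minimal sketch of symmetric 4-bit quantization with two codes packed per byte. It is an illustration only: the PR summary does not say whether the scale is per tensor or per group, nor how codes are packed, so the per-tensor scale, the low-nibble-first layout, and all function names here are assumptions.

```python
def quantize_int4(values):
    """Symmetric per-tensor int4 (assumed scheme): floats -> integers in [-8, 7]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def pack_int4(codes):
    """Pack two 4-bit codes per byte, low nibble first (assumed layout)."""
    padded = list(codes) + [0] * (len(codes) % 2)  # pad odd lengths with a zero code
    out = bytearray()
    for lo, hi in zip(padded[0::2], padded[1::2]):
        out.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(out)


def unpack_int4(packed, count):
    """Inverse of pack_int4: recover `count` signed 4-bit codes."""
    codes = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            # Nibbles 8..15 encode the negative values -8..-1.
            codes.append(nibble - 16 if nibble >= 8 else nibble)
    return codes[:count]


def dequantize_int4(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]
```

With this layout, five hypothetical skip-gate values occupy three bytes plus one stored scale, and each restored value differs from the original by at most half a quantization step.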

Novel Contributions

  • Selective int4 quantization applied only to skip gates and skip weights
  • Rank-1024 score-first adapter TTT
  • 4-epoch test-time training with 4096-token chunks
  • SP8192 tokenizer usage
  • Runtime-safe submission of 15,999,098 bytes, within the 16MB artifact cap
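
The score-first ordering named in the TTT entries can be sketched with a toy model. This assumes "score-first" means each chunk is scored with the current weights before any update that uses that chunk's tokens, and that the epoch count is the number of passes per chunk; neither is confirmed by the PR summary. The softmax unigram model below stands in for the rank-1024 adapter, and every name in it is hypothetical.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def chunk_bits(logits, chunk):
    """Total code length of `chunk` in bits under the current model."""
    probs = softmax(logits)
    return -sum(math.log2(probs[t]) for t in chunk)


def adapt(logits, chunk, learning_rate, epochs):
    """Full-batch gradient descent on the chunk's mean cross-entropy."""
    counts = [0] * len(logits)
    for t in chunk:
        counts[t] += 1
    n = len(chunk)
    for _ in range(epochs):
        probs = softmax(logits)
        # Gradient of mean cross-entropy w.r.t. logit i is p_i - freq_i.
        logits = [x - learning_rate * (p - c / n)
                  for x, p, c in zip(logits, probs, counts)]
    return logits


def score_first_ttt(tokens, vocab_size, chunk_size, learning_rate, epochs):
    """Return mean bits per token; each chunk is scored before it is trained on."""
    logits = [0.0] * vocab_size
    total_bits = 0.0
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]
        total_bits += chunk_bits(logits, chunk)                # score first
        logits = adapt(logits, chunk, learning_rate, epochs)   # then adapt
    return total_bits / len(tokens)
```

Because every chunk is scored before the model sees it, the reported figure never benefits from tokens it has already trained on. Note that this toy returns bits per token, whereas the PR reports val_bpb; converting between the two would need the byte length of the evaluated text.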