PR #1844
open
Add SkipQuant Adapter TTT (int4 skip_gates/weights + 4-epoch TTT)
by Hetul803
val_bpb
1.3110
Architecture
Transformer
Optimizer
—
Artifact Size
15,999,098 bytes
Training Techniques
Quantization
int4
bits: 4
scope: skip_gates and skip_weights
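The summary only says that skip_gates and skip_weights are stored at 4 bits; it does not give the quantization scheme. As a minimal sketch, assuming symmetric per-tensor quantization with a single float scale and two values packed per byte (all function names here are hypothetical, not taken from the PR):

```python
def quantize_int4(values):
    """Symmetric per-tensor int4 (assumed scheme): ints in [-8, 7] plus one scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(v / scale))) for v in values]
    return q, scale


def pack_nibbles(q):
    """Pack two signed 4-bit values into each byte, low nibble first."""
    if len(q) % 2:
        q = q + [0]  # pad to an even count
    out = bytearray()
    for lo, hi in zip(q[0::2], q[1::2]):
        out.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(out)


def unpack_nibbles(data, count):
    """Inverse of pack_nibbles; sign-extends each 4-bit value."""
    vals = []
    for b in data:
        for nib in (b & 0xF, b >> 4):
            vals.append(nib - 16 if nib >= 8 else nib)
    return vals[:count]


def dequantize_int4(q, scale):
    """Map the stored integers back to approximate float values."""
    return [x * scale for x in q]
```

Under this scheme each parameter costs half a byte plus a shared scale, which is why restricting int4 to a small set of tensors can trim the artifact without touching the main weights.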
Test-Time Training
score-first TTT
parameters: {"rank":1024,"learning_rate":0.025,"epochs":4,"chunk_size":4096}
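"Score-first" is read here as: each chunk is scored with the current parameters before the model is allowed to adapt on it, so no token is ever evaluated by a model that has already trained on it. The loop below is a toy sketch of that ordering, using the listed epochs, chunk_size and learning_rate as defaults. The "adapter" is a stand-in vector of unigram logits, not the rank-1024 adapter from the PR, and every function name is hypothetical.

```python
import math


def chunked(tokens, chunk_size):
    """Yield consecutive chunks of at most chunk_size tokens."""
    for i in range(0, len(tokens), chunk_size):
        yield tokens[i:i + chunk_size]


def nll_bits(logits, chunk):
    """Total negative log-likelihood of the chunk in bits under softmax(logits)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return sum((log_z - logits[t]) / math.log(2) for t in chunk)


def sgd_epoch(logits, chunk, lr):
    """One gradient step on the mean NLL: gradient is softmax minus token frequency."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    freq = [0.0] * len(logits)
    for t in chunk:
        freq[t] += 1.0 / len(chunk)
    return [x - lr * (p - f) for x, p, f in zip(logits, probs, freq)]


def score_first_ttt(tokens, vocab_size, chunk_size=4096, epochs=4, lr=0.025):
    """Return bits per token, scoring every chunk before adapting on it."""
    logits = [0.0] * vocab_size  # toy stand-in for the adapter parameters
    total_bits = 0.0
    for chunk in chunked(tokens, chunk_size):
        total_bits += nll_bits(logits, chunk)       # 1) score first
        for _ in range(epochs):
            logits = sgd_epoch(logits, chunk, lr)   # 2) then adapt
    return total_bits / len(tokens)
```

With a single chunk the adapter never influences the score, so any gain comes only from later chunks benefiting from updates made on earlier ones.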
Sequence Length
train_length: null
eval_length: 4096
Other
SP8192 tokenizer
parameters: null
Architecture
U-Net skip connections
Selective adapter on the skip pathway, with int4-quantized skip gates and skip weights
parameters: null
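The summary does not say how the skip gates and skip weights enter the forward pass. One common arrangement, shown here purely as an assumption, has the first half of the layers push their outputs onto a stack and the second half pop them in reverse order, adding each one back scaled by a learned weight and a sigmoid gate. The names `unet_forward`, `skip_weights` and `skip_gates` are illustrative and may not match the PR's code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def unet_forward(x, layers, skip_weights, skip_gates):
    """Toy U-Net style pass over a list of floats.

    layers: even-length list of callables mapping a list of floats to a list.
    skip_weights, skip_gates: one scalar each per layer in the second half.
    """
    half = len(layers) // 2
    stack = []
    for layer in layers[:half]:
        x = layer(x)
        stack.append(x)  # remember this activation for the mirrored layer
    for i, layer in enumerate(layers[half:]):
        skip = stack.pop()  # last stored activation is consumed first
        g = sigmoid(skip_gates[i])
        x = [xi + g * skip_weights[i] * si for xi, si in zip(x, skip)]
        x = layer(x)
    return x
```

In a layout like this the gates and weights are a tiny fraction of the parameter count, which would be consistent with quantizing only those tensors.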
Novel Contributions
- Selective int4 quantization applied only to skip gates and skip weights
- Rank-1024 score-first adapter TTT
- 4-epoch test-time training with 4096-token chunks
- SP8192 tokenizer usage
- Runtime-safe submission that fits under the 16MB artifact cap (15,999,098 bytes)
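To make the size claim concrete, the check below compares the reported artifact size with the cap. It assumes "16MB" means 16,000,000 bytes (decimal megabytes); if the cap were instead 16 MiB, the headroom would be larger. The constant and function names are hypothetical.

```python
ARTIFACT_CAP_BYTES = 16_000_000  # assumption: "16MB" read as decimal megabytes


def headroom(artifact_bytes, cap=ARTIFACT_CAP_BYTES):
    """Bytes left under the cap; a negative value means the artifact is too large."""
    return cap - artifact_bytes


print(headroom(15_999_098))  # 902
```

Under that reading the submission clears the limit by 902 bytes.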