PR #1156

open

Record: EGGROLL v2 — val_bpb 1.1161 (3-seed mean, std 0.0001)

by haikosys
val_bpb
1.1161
Architecture
Transformer
Optimizer
Artifact Size
~15.3 MB

Training Techniques

Quantization
GPTQ
bits: 6
scope: weights
Evaluation
sliding window eval
parameters: null
Test-Time Training
score-first TTT
parameters: null

Novel Contributions

  • EGGROLL (Antithetic Ternary Bin Search)
  • Post-GPTQ quantization refinement that optimizes INT6 bin assignments against BPB loss within the evaluation budget
  • Antithetic +1/-1 bin search on randomly selected quantized weight indices
  • Adds missing eval_val_sliding_ttt call to the evaluation pipeline
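
The antithetic bin search listed above can be sketched roughly as follows. This is a minimal illustration and not the PR's code: the function name `antithetic_bin_search`, the parameters `k`, `steps`, and `seed`, the signed INT6 range of -32..31, the single shared `scale`, and the greedy accept rule are all assumptions made for the example. `loss_fn` stands in for a BPB evaluation. One reading of "ternary" is that each selected weight ends a step shifted by -1, 0, or +1 bins, which is what the sketch does.

```python
import random

def antithetic_bin_search(bins, scale, loss_fn, steps, k=4, seed=0,
                          qmin=-32, qmax=31):
    """Greedy +1/-1 refinement of integer quantization bins (hypothetical sketch).

    bins    : list of ints, the quantized bin index of each weight
    scale   : float, dequantization step (assumed shared by all weights)
    loss_fn : callable taking the dequantized weights, returning a float
    """
    rng = random.Random(seed)
    bins = list(bins)
    best = loss_fn([b * scale for b in bins])
    for _ in range(steps):
        # Pick k random weight indices and a random direction for each.
        idx = rng.sample(range(len(bins)), k)
        signs = [rng.choice((-1, 1)) for _ in idx]
        candidates = []
        # Antithetic pair: apply the perturbation, then its exact negation.
        for direction in (1, -1):
            cand = list(bins)
            for i, s in zip(idx, signs):
                # Clamp so every bin stays inside the assumed INT6 range.
                cand[i] = max(qmin, min(qmax, cand[i] + direction * s))
            candidates.append((loss_fn([b * scale for b in cand]), cand))
        loss, cand = min(candidates, key=lambda t: t[0])
        # Keep the better of the pair only if it beats the current loss;
        # otherwise the bins are left unchanged (the "0" outcome).
        if loss < best:
            bins, best = cand, loss
    return bins, best
```

Under these assumptions the loss never increases, so the search can be cut off whenever the evaluation budget runs out. Each step costs two calls to `loss_fn`, which in practice would be the dominant cost.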