PR #1884
open
Experiment: SmearGate BOS Fix + train-only logit calibration
by someone114514
val_bpb
1.0615
Architecture
Transformer
Optimizer
—
Artifact Size
~15.95 MB
Training Techniques
Architecture
SmearGate
SmearGate attention with a BOS document-boundary fix that prevents attention from bleeding across documents.
parameters: null
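A minimal sketch of what a BOS boundary fix could look like, not the PR's actual code. It assumes SmearGate adds a gated copy of the previous position's activation to the current one; the additive form, the function name `smear_with_bos_mask`, and its arguments are all hypothetical. The only point illustrated is that the gate is forced to zero wherever the current token is BOS, so the last token of one document cannot leak into the first token of the next.

```python
import numpy as np

def smear_with_bos_mask(x, gate, token_ids, bos_id):
    """Blend each position with its predecessor, except across document starts.

    x:         (T, D) activations for a packed sequence of documents
    gate:      (T,)   gate values in [0, 1] (assumed to come from a learned gate)
    token_ids: (T,)   integer token ids
    bos_id:    id of the beginning-of-sequence token (assumed document separator)
    """
    # Shift activations right by one position; position 0 has no predecessor.
    prev = np.zeros_like(x)
    prev[1:] = x[:-1]
    # 0.0 where a new document starts, 1.0 elsewhere.
    keep = (token_ids != bos_id).astype(x.dtype)
    return x + (gate * keep)[:, None] * prev
```

Without the `keep` mask, the BOS position of every document after the first would receive a contribution from the preceding document's final token.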
Quantization
GPTQ
bits: null
scope: post-training
Test-Time Training
score-first TTT
parameters: {"phases":3}
Other
other
Train-only post-GPTQ logit calibration using a fixed global temperature and coarse token-group bias buckets, frozen after fitting and applied before softmax.
parameters: {"global_temperature":true,"bias_buckets":["byte length","starts-with-space","newline","digit","punctuation","alpha/case"],"train_only":true}
Novel Contributions
- Adds an optional train-only post-GPTQ logit calibration pass
- Fits a global temperature plus coarse token-group bias buckets from training tokens only
- Applies frozen affine logit correction before softmax during quantized diagnostic eval and phased score-first TTT
- Tests whether local post-GPTQ calibration gains transfer to the stronger SmearGate BOS fix stack
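
The calibration described in the list above can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the PR's implementation: the correction is taken to be `logits / T + bias[bucket(token)]`, fitted by plain gradient descent on training-token negative log-likelihood and then frozen. The names `token_bucket`, `calibrate`, and `fit_calibration` are hypothetical, and the bucketing is a simplified stand-in for the listed groups (it omits byte length, for example).

```python
import numpy as np

def token_bucket(tok: str) -> int:
    """Hypothetical coarse grouping of a token string into one of 6 buckets."""
    body = tok.lstrip(" ")
    if "\n" in tok:
        return 0                      # newline
    if body.isdigit():
        return 1                      # digit
    if body and all(not c.isalnum() for c in body):
        return 2                      # punctuation
    if tok.startswith(" "):
        return 3                      # starts-with-space
    if body.isalpha() and body[:1].isupper():
        return 4                      # alpha, capitalized
    return 5                          # everything else

def log_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def calibrate(logits, temperature, bias, bucket_of_token):
    """Frozen affine correction applied before softmax.

    logits: (N, V); bias: (B,); bucket_of_token: (V,) ints in [0, B).
    """
    return logits / temperature + bias[bucket_of_token]

def mean_nll(logits, targets):
    return float(-log_softmax(logits)[np.arange(len(targets)), targets].mean())

def fit_calibration(logits, targets, bucket_of_token, n_buckets,
                    steps=500, lr=0.1):
    """Fit (temperature, bias) on training tokens only by gradient descent."""
    log_t = 0.0                       # optimize log T so that T stays positive
    bias = np.zeros(n_buckets)
    bucket_onehot = np.eye(n_buckets)[bucket_of_token]   # (V, B)
    n = len(targets)
    for _ in range(steps):
        t = np.exp(log_t)
        z = calibrate(logits, t, bias, bucket_of_token)
        grad_z = np.exp(log_softmax(z))                  # softmax probabilities
        grad_z[np.arange(n), targets] -= 1.0             # dNLL/dz per position
        grad_z /= n
        grad_bias = (grad_z @ bucket_onehot).sum(axis=0)
        grad_log_t = float((grad_z * (-logits / t)).sum())  # dz/dlog T = -logits/T
        bias -= lr * grad_bias
        log_t -= lr * grad_log_t
    return float(np.exp(log_t)), bias
```

In this sketch the fitted `temperature` and `bias` would be computed once from training tokens and then passed unchanged to `calibrate` at evaluation time, which matches the "frozen after fitting" description. Adding the same constant to every bucket leaves the softmax unchanged, so only differences between bucket biases matter.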