PR #1909
val_bpb 1.06996: Independent 3-seed reproduction of PR #1874 + TTT_LORA_RANK=192
by GodlyDonuts
val_bpb: 1.0700
Architecture: Transformer
Optimizer: SGD
Artifact Size: ~76 MB
Training Techniques

Quantization
- GPTQ (bits: 6, scope: all)
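The quantization entry records 6-bit GPTQ over all weights. As a rough illustration only (real GPTQ chooses roundings using second-order Hessian information, which is omitted here), a per-row symmetric 6-bit quantize/dequantize round trip can be sketched as:

```python
import numpy as np

def fake_quant_6bit(w: np.ndarray) -> np.ndarray:
    """Per-row symmetric 6-bit quantize/dequantize round trip.

    Illustrative sketch, not the GPTQ algorithm: we just round each row
    to the nearest of 2**6 uniform levels.
    """
    levels = 2 ** 6                      # 64 levels at 6 bits
    qmax = levels // 2 - 1               # symmetric integer range [-32, 31]
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0              # avoid divide-by-zero on all-zero rows
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 16)).astype(np.float32)
w_q = fake_quant_6bit(w)
err = float(np.abs(w - w_q).max())       # bounded by scale / 2 per row
```

The reconstruction error is bounded by half a quantization step per row, which is why 6 bits is usually a mild lossy step compared to 3- or 4-bit settings.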
Architecture
- SmearGate: part of the reproduced PR #1874 stack (parameters: null)
- Gated Attention: AttnOutGate / gated attention component from the reproduced stack (parameters: {"width": 36})
Test-Time Training
- LoRA TTT (parameters: {"rank": 192})
- score-first TTT (parameters: null)
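The single hyperparameter change in this PR is raising the TTT LoRA rank from 128 to 192. A minimal sketch of a rank-192 LoRA adapter on a frozen linear layer (dimensions and init scales here are illustrative assumptions, not values from the PR):

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer with a rank-r LoRA adapter (illustrative sketch).

    The base weight W stays fixed; at test time only the low-rank factors
    A and B would be updated. rank=192 mirrors this PR's change (up from 128).
    """

    def __init__(self, d_in: int, d_out: int, rank: int = 192, alpha: float = 1.0):
        rng = np.random.default_rng(0)
        self.W = rng.normal(0, 0.02, size=(d_out, d_in))  # frozen base weight
        self.A = rng.normal(0, 0.02, size=(rank, d_in))   # trainable down-projection
        self.B = np.zeros((d_out, rank))                  # trainable up-projection, zero-init
        self.scale = alpha / rank

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # y = x W^T + (alpha/r) * x A^T B^T ; adapter starts as a no-op (B == 0)
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(d_in=64, d_out=64, rank=192)
x = np.ones((2, 64))
y = layer(x)   # equals the frozen layer's output until B is trained
```

Because B is zero-initialized, the adapter is exactly a no-op before any test-time updates; the rank only changes the adapter's capacity (and parameter count) once updates begin.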
Evaluation
- sliding window eval (parameters: null)
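Sliding window evaluation scores a long sequence with overlapping context windows so that every scored token has (close to) full left context. A sketch of the averaging, with a dummy stand-in for the model and window/stride values that are assumptions (the record lists no parameters):

```python
import numpy as np

def sliding_window_nll(token_nll, tokens, window=8, stride=4):
    """Average per-token NLL over a long sequence with a sliding window.

    Each window covers up to `window` tokens, but only the final `stride`
    tokens are scored. `token_nll(ctx)` stands in for a model call that
    returns one NLL per context position.
    """
    total, count = 0.0, 0
    for start in range(0, len(tokens), stride):
        ctx = tokens[max(0, start + stride - window): start + stride]
        nll = token_nll(ctx)                   # one NLL per context position
        n_new = min(stride, len(tokens) - start)
        total += float(np.sum(nll[-n_new:]))   # score only the newest tokens
        count += n_new
    return total / count

# Dummy "model": uniform over 256 byte values, so NLL = ln(256) everywhere,
# which converts to exactly 8 bits per byte.
uniform = lambda ctx: np.full(len(ctx), np.log(256.0))
bpb = sliding_window_nll(uniform, list(range(20))) / np.log(2.0)
```

Dividing the mean NLL (in nats) by ln 2 converts it to bits per byte, the val_bpb metric this record reports.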
Compression
- lzma (level: null)
- brotli (level: null)
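Both listed codecs are general-purpose lossless compressors applied to the serialized artifact. A sketch using stdlib lzma (brotli needs the third-party `brotli` package and is omitted; the preset value is an assumption, since the record lists level as null):

```python
import lzma

# Illustrative: compress a serialized artifact with lzma and measure the
# ratio. Highly regular data compresses far below the 6-bit-quantized
# weights a real artifact would contain.
payload = bytes(range(256)) * 4096            # ~1 MiB of repetitive data
packed = lzma.compress(payload, preset=9)     # preset=9 is an assumption
ratio = len(packed) / len(payload)
```

Quantizing weights to 6 bits before compression matters here: fewer distinct byte patterns give the entropy coder more redundancy to exploit.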
Regularization
- weight decay (parameters: null)

LR Schedule
- warmdown (parameters: null)
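"Warmdown" typically means holding the learning rate constant and then decaying it linearly to zero over the final fraction of training. A sketch with illustrative values (the record gives no parameters):

```python
def warmdown_lr(step, total_steps, base_lr=0.5, warmdown_frac=0.3):
    """Constant-then-linear-decay ("warmdown") schedule sketch.

    Holds base_lr for the first (1 - warmdown_frac) of training, then
    decays linearly to zero. All values are illustrative assumptions.
    """
    decay_start = int(total_steps * (1.0 - warmdown_frac))
    if step < decay_start:
        return base_lr
    remaining = total_steps - step
    return base_lr * remaining / (total_steps - decay_start)

lrs = [warmdown_lr(s, total_steps=10) for s in range(10)]
```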
Optimizer
- SGD (weight_decay: null, momentum: null, other_params: null)
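The optimizer entry lists SGD with every hyperparameter left null. For reference, one SGD update with momentum and L2-style weight decay folded into the gradient can be sketched as follows (all hyperparameter values are assumptions):

```python
import numpy as np

def sgd_step(w, grad, vel, lr=0.1, momentum=0.9, weight_decay=1e-4):
    """One SGD update with momentum and L2 weight decay (illustrative values).

    The record leaves momentum/weight_decay null, so these defaults are
    assumptions, not values from the PR.
    """
    g = grad + weight_decay * w      # L2 regularization folded into the gradient
    vel = momentum * vel + g         # momentum buffer update
    return w - lr * vel, vel

w = np.array([1.0, -2.0])
vel = np.zeros_like(w)
w, vel = sgd_step(w, grad=np.array([0.5, 0.5]), vel=vel)
```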
Novel Contributions
- Independent end-to-end 3-seed reproduction of PR #1874 on separate hardware
- Single hyperparameter change raising TTT LoRA rank from 128 to 192
- Provision of reload-ready quantized artifacts and unedited training logs
- Byte-budget compliance verification with reported headroom under the 16 MB cap
- Statistical comparison against current merged SOTA with reported significance
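The byte-budget compliance check claimed above amounts to comparing the compressed artifact's on-disk size against the 16 MB cap. A minimal sketch (whether the cap means MB or MiB is an assumption, as is the file layout):

```python
import os
import lzma
import tempfile

BUDGET = 16 * 1024 * 1024   # 16 MB cap from the record; MiB interpretation assumed

def check_budget(path: str) -> int:
    """Return remaining headroom in bytes; raise if the artifact is over the cap."""
    size = os.path.getsize(path)
    if size > BUDGET:
        raise ValueError(f"artifact {size} B exceeds {BUDGET} B cap")
    return BUDGET - size

# Demo: write a small compressed artifact to a temp file and verify headroom.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(lzma.compress(b"\x00" * 100_000))
    path = f.name
headroom = check_budget(path)
```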