val_bpb: 1.3510
Architecture: —
Optimizer: —
Artifact Size: 11,876,675 bytes
Training Techniques
Quantization: int8
- bits: 8
- scope: model weights / submission artifact
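The card does not state the exact quantization scheme, so here is a minimal sketch of symmetric per-tensor int8 weight quantization — one fp32 scale per tensor, codes clipped to [-127, 127]. The function names are hypothetical, not from the submission:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: int8 codes plus one fp32 scale."""
    max_abs = float(np.abs(w).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, np.float32(scale)

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct approximate fp32 weights; rounding error is at most scale/2."""
    return q.astype(np.float32) * scale
```

Storing one byte per weight (plus negligible scale overhead) is what brings the fp32 checkpoint under the artifact limit before compression is even applied.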
Evaluation: stride-based eval
- parameters: {"stride": 0}
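A sketch of what strided sliding-window evaluation typically looks like. The harness's exact semantics are not given here; this sketch assumes `stride=0` (the card's `EVAL_STRIDE=0`) means non-overlapping windows, i.e. stride equals the window length. `nll_fn` is a hypothetical stand-in for the model, returning one nat-valued loss per predicted token:

```python
import numpy as np

def eval_bits_per_token(nll_fn, tokens: np.ndarray, window: int, stride: int) -> float:
    """Strided sliding-window eval. Assumption: stride == 0 means
    non-overlapping windows (step = window). nll_fn(chunk) must return
    len(chunk) - 1 negative log-likelihoods in nats (the first token in
    each window has no prediction target)."""
    step = window if stride == 0 else stride
    total_nll, total_tok = 0.0, 0
    for start in range(0, len(tokens) - 1, step):
        chunk = tokens[start:start + window]
        if len(chunk) < 2:
            break
        total_nll += float(nll_fn(chunk).sum())
        total_tok += len(chunk) - 1
    return total_nll / total_tok / np.log(2.0)  # nats -> bits per token
```

Converting bits per token to bits per byte (the val_bpb metric) additionally divides total bits by the raw byte count of the evaluated text rather than the token count.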
Test-Time Training: LoRA TTT
- parameters: null
Compression: zlib
- level: null
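A sketch of packing the quantized weights into a zlib-compressed artifact and checking it against the 16 MB limit. The blob-header layout and function name are hypothetical; `level: null` is taken here to mean zlib's default compression level, which is an assumption:

```python
import io
import zlib

def pack_artifact(blobs, limit_bytes=16 * 1024 * 1024):
    """Concatenate named byte blobs with simple 'name:length' headers,
    zlib-compress the result (default level -- an assumption, since the
    card lists level: null), and enforce the artifact size limit."""
    raw = io.BytesIO()
    for name, blob in sorted(blobs.items()):
        raw.write(f"{name}:{len(blob)}\n".encode())
        raw.write(blob)
    packed = zlib.compress(raw.getvalue())
    if len(packed) > limit_bytes:
        raise ValueError(f"artifact is {len(packed)} bytes, over the {limit_bytes}-byte limit")
    return packed
```

int8 codes compress well under zlib because quantization concentrates the value distribution, which is how the 11,876,675-byte artifact stays under the cap.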
Other: training capped by wall-clock time on a local 1xA100 run
- parameters: {"max_wallclock_seconds": 600, "hardware": "1x NVIDIA A100-SXM4-40GB", "train_shards": 80}
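The wall-clock cap can be sketched as a loop that stops once the budget is spent rather than after a fixed step count. `step_fn` is a placeholder for one optimizer step; the 600-second default matches the card's max_wallclock_seconds:

```python
import time

def train_with_wallclock_cap(step_fn, max_wallclock_seconds: float = 600.0) -> int:
    """Run training steps until the wall-clock budget is exhausted.
    Uses time.monotonic() so system clock adjustments cannot extend the run.
    Returns the number of completed steps."""
    start = time.monotonic()
    steps = 0
    while time.monotonic() - start < max_wallclock_seconds:
        step_fn()
        steps += 1
    return steps
```

Note the check happens before each step, so a run can slightly overshoot the cap by at most one step's duration; a stricter harness would also checkpoint before the deadline.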
Novel Contributions
- Non-record baseline submission from a local 1xA100 run, focused on the TTT metric
- Uses the standard final evaluation with EVAL_STRIDE=0
- Includes the exact train_gpt.py snapshot and training log for reproducibility
- Fits within the 16 MB artifact limit via int8 quantization plus zlib compression