val_bpb: 1.0992
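For context, val_bpb is bits per byte on the validation set. A minimal sketch of the usual conversion from mean cross-entropy (nats per token) to bits per byte; the function name and token/byte accounting here are illustrative, not from the record script:

```python
import math

def bits_per_byte(mean_ce_nats: float, n_tokens: int, n_bytes: int) -> float:
    # Total compressed size in bits implied by the model's loss,
    # normalized by the raw byte count of the evaluated text.
    total_bits = mean_ce_nats * n_tokens / math.log(2)
    return total_bits / n_bytes
```

With a loss of ln(2) nats per token and one token per byte, this gives exactly 1.0 bit per byte.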
Architecture: Transformer
Optimizer: —
Artifact Size: 15,992,604 bytes (≈15.25 MiB, under the 16 MB limit)
Training Techniques

Architecture
- depth recurrence (parameters: {"layers": 3}): uses the public SP8192 + 3-Layer Recurrence + Parallel Residuals + Legal TTT record script as the reproduction target.
- parallel residuals (parameters: null): parallel residual pathway used in the referenced record script.

Test-Time Training
- TTT (parameters: {"enabled": true, "learning_rate": 0.005, "epochs": 3})
- Legal TTT (parameters: null)
Evaluation
- quantized chunked eval (parameters: null)
- sliding window eval (parameters: null)
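Sliding window eval scores each token given a bounded left context rather than re-running full sequences. A minimal sketch with a hypothetical `score_fn` interface (the record's window size and stride are not stated here):

```python
def sliding_window_eval(tokens, window, score_fn):
    # Score each token with at most `window` tokens of left context,
    # sliding the window one position at a time.
    losses = []
    for i in range(1, len(tokens)):
        ctx = tokens[max(0, i - window):i]
        losses.append(score_fn(ctx, tokens[i]))
    return sum(losses) / len(losses)
```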
Compression
- lzma (level: null)
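The artifact is LZMA-compressed at an unspecified level. A minimal sketch using Python's stdlib lzma module; the preset below is a guess, since the record lists level: null:

```python
import lzma

def compress_artifact(data: bytes, preset: int = 9) -> bytes:
    # Compress raw bytes with LZMA; PRESET_EXTREME trades speed for
    # a smaller artifact, which matters under a hard size limit.
    return lzma.compress(data, preset=preset | lzma.PRESET_EXTREME)

payload = b"weights " * 10_000
blob = compress_artifact(payload)
```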
Novel Contributions
- Non-record reproduction/evidence bundle for the public SP8192 QK-Gain 5.25 run
- Validated an 8×H100 RunPod training pass under the 600-second time limit
- Produced an under-16 MB compressed artifact after installing the missing Brotli dependency
- Included logs and verification files for reproducibility and auditability