PR #1832

open

RunPod SP8192 QK-Gain 5.25 seed 42 reproduction evidence

by sricursion
val_bpb
1.0992
Architecture
Transformer
Optimizer
Artifact Size
15,992,604 bytes

Training Techniques

Architecture
depth recurrence
Uses the public SP8192 + 3-Layer Recurrence + Parallel Residuals + Legal TTT record script as the reproduction target.
parameters: {"layers":3}
parallel residuals
Parallel residual pathway used in the referenced record script.
parameters: null
Test-Time Training
TTT
parameters: {"enabled":true,"learning_rate":0.005,"epochs":3}
Legal TTT
parameters: null
Evaluation
quantized chunked eval
parameters: null
sliding window eval
parameters: null
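
A sliding-window evaluation that reports bits per byte can be sketched as follows. This is a minimal illustration, not the record script's code: the function names, the `window` and `stride` arguments, and the rule that each token is scored exactly once are all assumptions, since the PR lists no parameters for this step.

```python
import math

def sliding_windows(n_tokens, window, stride):
    """Yield (start, end, score_from) spans covering n_tokens.

    The model would see tokens[start:end] as context, but only positions
    score_from..end are scored, so no token is counted twice.
    Hypothetical helper; requires 0 < stride <= window.
    """
    if not 0 < stride <= window:
        raise ValueError("need 0 < stride <= window")
    start = 0
    scored_upto = 0
    while scored_upto < n_tokens:
        end = min(start + window, n_tokens)
        yield start, end, scored_upto
        scored_upto = end
        start += stride

def bits_per_byte(total_nll_nats, n_bytes):
    """Convert a summed negative log-likelihood in nats to bits per byte."""
    return total_nll_nats / (math.log(2) * n_bytes)

spans = list(sliding_windows(10, window=4, stride=2))
```

With these toy values, every window after the first scores only the `stride` newest tokens while reusing the earlier tokens as context.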
Compression
lzma
level: null
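
As a rough illustration of the lzma step, the sketch below round-trips a byte payload through Python's standard `lzma` module. The helper names, the placeholder payload, and `preset=9` are assumptions; the PR records the compression level as null.

```python
import lzma

def compress_artifact(payload: bytes, preset: int = 9) -> bytes:
    """Compress bytes into an .xz container (preset value is an assumption)."""
    return lzma.compress(payload, preset=preset)

def decompress_artifact(blob: bytes) -> bytes:
    """Invert compress_artifact."""
    return lzma.decompress(blob)

# Hypothetical stand-in for serialized model weights: 64 KiB of zeros.
payload = bytes(1024) * 64
blob = compress_artifact(payload)
```

Real quantized weights compress far less than this all-zero placeholder, so the ratio here says nothing about the reported artifact.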

Novel Contributions

  • Non-record reproduction/evidence bundle for a public SP8192 QK-Gain 5.25 run
  • Validated an 8xH100 RunPod training pass under the 600-second limit
  • Produced an under-16MB compressed artifact after installing the missing Brotli dependency
  • Included logs and verification files for reproducibility and auditability
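
The artifact-size claim can be checked with simple arithmetic. The sketch below assumes "16MB" means 16,000,000 bytes; the PR does not state the exact cap, so that constant and the helper name are assumptions.

```python
ARTIFACT_BYTES = 15_992_604        # artifact size reported in this PR
ASSUMED_LIMIT_BYTES = 16_000_000   # assumption: "16MB" read as decimal megabytes

def headroom(artifact_bytes: int, limit_bytes: int) -> int:
    """Return the bytes remaining under the limit (negative if over)."""
    return limit_bytes - artifact_bytes

print(headroom(ARTIFACT_BYTES, ASSUMED_LIMIT_BYTES))  # 7396
```

Under that reading the artifact clears the limit by 7,396 bytes; a binary reading of 16 MiB would leave a larger margin.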