PR #2125

open

add pr2069 best 8xh100 submission package

by tenet-diver
val_bpb: 1.2349
Architecture: Transformer
Optimizer: (not specified)
Artifact Size: 15843310 bytes

Training Techniques

Test-Time Training
TTT
parameters: {"enabled":false}
Sequence Length
sequence_length
train_length: 2097152
eval_length: null
Other
other
Post-deadline reproduction of the PR #2069 candidate on the target 8xH100 hardware. It preserves direct rerun evidence and does not claim a record.
parameters: {"candidate_id":"best4x_ttt_disabled_qk525","base_submission":"PR #2069","hardware":"8x NVIDIA H100 80GB HBM3","seed":1337}
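The two `parameters` entries above are plain JSON, so they can be read back mechanically when comparing reruns. The following is a minimal sketch, not part of the submission package: the `summarize` helper and its output format are hypothetical, and only the JSON strings themselves come from this PR.

```python
import json

# JSON strings copied verbatim from the "parameters" fields of this PR.
TTT_JSON = '{"enabled":false}'
RUN_JSON = (
    '{"candidate_id":"best4x_ttt_disabled_qk525",'
    '"base_submission":"PR #2069",'
    '"hardware":"8x NVIDIA H100 80GB HBM3",'
    '"seed":1337}'
)

ttt_params = json.loads(TTT_JSON)
run_params = json.loads(RUN_JSON)


def summarize(run: dict, ttt: dict) -> str:
    """Hypothetical helper: one-line description of a rerun configuration."""
    ttt_state = "on" if ttt["enabled"] else "off"
    return (
        f'{run["candidate_id"]} (base {run["base_submission"]}) '
        f'on {run["hardware"]}, seed {run["seed"]}, TTT {ttt_state}'
    )


summary = summarize(run_params, ttt_params)
print(summary)
```

Keeping the seed and hardware string next to the candidate ID makes it easier to tell whether two reported scores came from the same configuration.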

Novel Contributions

  • Post-deadline rerun of the PR #2069 candidate on intended 8xH100 hardware
  • Preservation of direct reproduction evidence for the same candidate
  • Documentation that the rerun did not beat the leaderboard baseline
  • Packaging of a sub-16MB artifact for a non-record submission
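The "sub-16MB" claim can be checked against the reported artifact size with simple arithmetic. This is a sketch under an assumption: the PR does not say whether the cap is decimal (16 × 10^6 bytes) or binary (16 × 2^20 bytes), so both readings are checked. The `headroom` function is illustrative only.

```python
# Artifact size reported in this PR, in bytes.
ARTIFACT_BYTES = 15_843_310

# Assumption: the PR only says "sub-16MB", so both common readings
# of "16 MB" are checked here.
LIMIT_DECIMAL = 16 * 10**6   # 16,000,000 bytes
LIMIT_BINARY = 16 * 2**20    # 16,777,216 bytes


def headroom(size_bytes: int, limit_bytes: int) -> int:
    """Return the bytes remaining under the limit (negative if over)."""
    return limit_bytes - size_bytes


decimal_headroom = headroom(ARTIFACT_BYTES, LIMIT_DECIMAL)
binary_headroom = headroom(ARTIFACT_BYTES, LIMIT_BINARY)

print(f"decimal headroom: {decimal_headroom} bytes")  # 156690
print(f"binary headroom:  {binary_headroom} bytes")   # 933906
```

Under either reading the artifact fits, with about 157 KB to spare in the stricter decimal case.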