PR #1878

open

val_bpb 0.85018 (3-seed mean) Raphe_II

by newjordan
val_bpb: 0.8502
Architecture: Transformer
Optimizer:
Artifact Size: 15,995,307 bytes

Training Techniques

Novel Contributions

  • Three-seed mean submission
  • Raphe_II model variant
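
For readers unfamiliar with seed-averaged reporting, here is a minimal sketch of how a three-seed mean such as the headline val_bpb could be computed. The function name `seed_mean` and the per-seed numbers are hypothetical placeholders, since this summary does not list the individual seed results.

```python
from statistics import mean


def seed_mean(scores):
    """Return the arithmetic mean of per-seed val_bpb scores."""
    scores = list(scores)
    if not scores:
        raise ValueError("need at least one per-seed score")
    return mean(scores)


# Placeholder values for illustration only; NOT the per-seed results of this PR.
example_scores = {"seed_a": 0.8490, "seed_b": 0.8500, "seed_c": 0.8510}

print(f"3-seed mean val_bpb: {seed_mean(example_scores.values()):.5f}")
```

Reporting the mean over several seeds reduces the chance that a single lucky initialization drives the headline number.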