PR #1846

open

val_bpb 0.87206 (3-seed mean) Raphe

by newjordan
val_bpb
0.8721
Architecture
Transformer
Optimizer
Artifact Size

Training Techniques

Evaluation
sliding window eval
parameters: null

Novel Contributions

  • Raphe submission reporting a 3-seed mean validation bpb of 0.87206
  • Uses sliding window evaluation
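For readers unfamiliar with the term, sliding window evaluation scores a long token stream in overlapping windows so that each token is scored exactly once, with earlier tokens in the window serving as context. The sketch below is a generic illustration, not the code from this PR: the card lists the evaluation parameters as null, so `window`, `stride`, `token_nll`, and `n_bytes` are hypothetical names and values chosen for the example. It also shows the usual conversion from summed negative log-likelihood in nats to bits per byte.

```python
import math
from typing import Callable, Iterator, List, Sequence, Tuple


def sliding_windows(n_tokens: int, window: int, stride: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (start, end, score_from) triples.

    The model is fed tokens[start:end]; only tokens[score_from:end] are scored,
    so every token is counted exactly once across all windows.
    """
    if not 0 < stride <= window:
        raise ValueError("need 0 < stride <= window")
    scored_upto = 0
    while scored_upto < n_tokens:
        end = min(scored_upto + stride, n_tokens)
        start = max(0, end - window)  # keep at most `window` tokens in view
        yield start, end, scored_upto
        scored_upto = end


def sliding_window_bpb(
    tokens: Sequence[int],
    token_nll: Callable[[Sequence[int]], List[float]],  # hypothetical model hook
    n_bytes: int,
    window: int,
    stride: int,
) -> float:
    """Bits per byte over `tokens`, assuming `token_nll` returns one NLL (nats) per window position."""
    total_nats = 0.0
    for start, end, score_from in sliding_windows(len(tokens), window, stride):
        nll = token_nll(tokens[start:end])
        # Skip positions already scored by an earlier window.
        total_nats += sum(nll[score_from - start:])
    # nats -> bits (divide by ln 2), then normalize by raw byte count.
    return total_nats / (math.log(2) * n_bytes)


def uniform_nll(chunk: Sequence[int]) -> List[float]:
    """Stand-in model (an assumption for the demo): uniform over 256 symbols."""
    return [math.log(256)] * len(chunk)


demo_bpb = sliding_window_bpb(list(range(10)), uniform_nll, n_bytes=10, window=4, stride=2)
```

With the uniform stand-in model and one byte per token, the result is 8 bits per byte, which is a quick sanity check that no token is skipped or double-counted. A smaller stride gives each scored token more context at the cost of more forward passes.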