PG Field Guide
PR #1846 (open)
val_bpb 0.87206 (3-seed mean) Raphe
by newjordan
val_bpb: 0.8721
Architecture: Transformer
Optimizer: not specified
Artifact Size: not specified
Training Techniques
Evaluation: sliding window eval (parameters: null)
Novel Contributions
Raphe submission reporting a 3-seed mean validation bpb (val_bpb) of 0.87206
Uses sliding window evaluation
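
The page gives no parameters for the sliding window evaluation, so the following is only a minimal sketch of how such an evaluation is commonly set up, not the submission's actual code. The names `sliding_windows`, `bits_per_byte`, `evaluate_bpb`, and `nll_fn`, as well as the `context` and `stride` values, are hypothetical. Each window re-reads earlier tokens as context but scores only the tokens not yet scored, and the summed negative log-likelihood (in nats) is converted to bits per byte.

```python
import math
from typing import Callable, List, Sequence, Tuple


def sliding_windows(n_tokens: int, context: int, stride: int) -> List[Tuple[int, int, int]]:
    """Return (start, end, score_from) triples that score every token exactly once.

    Tokens in [start, score_from) serve only as context;
    tokens in [score_from, end) are the ones whose loss is counted.
    """
    assert 0 < stride <= context, "stride must be positive and no larger than context"
    windows = []
    scored_upto = 0
    while scored_upto < n_tokens:
        if scored_upto == 0:
            # The first window has no earlier context, so it scores everything it sees.
            end = min(context, n_tokens)
        else:
            end = min(scored_upto + stride, n_tokens)
        start = max(0, end - context)
        windows.append((start, end, scored_upto))
        scored_upto = end
    return windows


def bits_per_byte(total_nll_nats: float, n_bytes: int) -> float:
    """Convert a summed negative log-likelihood in nats to bits per byte."""
    return total_nll_nats / (math.log(2) * n_bytes)


def evaluate_bpb(
    tokens: Sequence[int],
    n_bytes: int,
    nll_fn: Callable[[Sequence[int], int], float],
    context: int,
    stride: int,
) -> float:
    """Accumulate loss over sliding windows and report bits per byte.

    nll_fn(window, offset) is a hypothetical model hook: it must return the
    summed NLL (nats) of window[offset:] given the tokens that precede them.
    """
    total = 0.0
    for start, end, score_from in sliding_windows(len(tokens), context, stride):
        total += nll_fn(tokens[start:end], score_from - start)
    return bits_per_byte(total, n_bytes)


def uniform_byte_nll(window: Sequence[int], offset: int) -> float:
    """Stand-in model for illustration: uniform over 256 byte values."""
    return (len(window) - offset) * math.log(256)
```

With the stand-in uniform model and byte-level tokens, `evaluate_bpb(list(range(10)), 10, uniform_byte_nll, context=4, stride=2)` comes out at 8 bits per byte up to floating-point rounding, which is a quick sanity check that no token is skipped or double counted. A smaller stride gives each scored token more context at the cost of more forward passes.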