val_bpb
1.3529
Architecture
—
Optimizer
—
Artifact Size
—
Training Techniques
Sequence Length
sequence_length
train_length: 1024
eval_length: null
Other
other
Local single-GPU baseline reproduction run of the OpenAI Parameter Golf NaiveBaseline
parameters: {"train_shards":1,"grad_accum_steps":8}
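The `grad_accum_steps` value of 8 suggests that each optimizer step averages gradients over eight micro-batches, which is the usual way to fit a larger effective batch on a single GPU. The sketch below illustrates that idea on a toy one-parameter least-squares model in plain Python. The model, the data, and the function names are hypothetical and are not taken from the NaiveBaseline code; only the value 8 comes from the run parameters.

```python
# Minimal sketch of gradient accumulation on a toy 1-D least-squares model.
# The model, data, and names are hypothetical; only GRAD_ACCUM_STEPS = 8
# is taken from the run parameters.

GRAD_ACCUM_STEPS = 8


def grad_mse(w, xs, ys):
    """Gradient of the mean of 0.5 * (w*x - y)**2 with respect to w."""
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, accum_steps=GRAD_ACCUM_STEPS):
    """Split the batch into equal micro-batches and average their gradients."""
    assert len(xs) % accum_steps == 0, "batch must divide evenly"
    micro = len(xs) // accum_steps
    total = 0.0
    for i in range(accum_steps):
        lo, hi = i * micro, (i + 1) * micro
        # Scale each micro-batch gradient by 1/accum_steps, analogous to
        # dividing the loss by accum_steps before the backward pass.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / accum_steps
    return total


xs = [float(i) for i in range(1, 17)]  # 16 samples -> 8 micro-batches of 2
ys = [2.0 * x for x in xs]
full = grad_mse(0.5, xs, ys)           # gradient over the whole batch
accum = accumulated_grad(0.5, xs, ys)  # gradient via accumulation
```

With equal-sized micro-batches and a mean-reduced loss, `accum` matches `full` up to floating-point rounding, which is why accumulation can stand in for a larger batch.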
Novel Contributions
- Local single-GPU reproduction of the OpenAI Parameter Golf NaiveBaseline
- Reported best observed validation bpb of 1.3529 at step 4200
- Documented that validation bpb improved from 4.1077 to 1.3529 and plateaued after step 4200
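
For readers unfamiliar with the metric, bits per byte (bpb) is commonly obtained by converting the summed cross-entropy from nats to bits and dividing by the number of raw bytes in the evaluated text. The sketch below shows that conversion under the assumption that this run follows the common definition. The function name and its arguments are hypothetical, and no values from the run are used.

```python
import math


def bits_per_byte(mean_nats_per_token, n_tokens, n_bytes):
    """Convert a mean per-token cross-entropy (in nats) to bits per byte.

    Assumes the common definition: total bits over the evaluated text
    divided by the number of raw bytes that text occupies.
    """
    if n_bytes <= 0:
        raise ValueError("n_bytes must be positive")
    total_bits = mean_nats_per_token * n_tokens / math.log(2)
    return total_bits / n_bytes
```

Because the denominator counts bytes rather than tokens, the metric stays comparable across tokenizers with different vocabularies.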