LR scheduling tuned for a single-device run
LR Schedule
Used in: 2 PRs
Best BPB: 1.4078
Avg BPB: 1.4078
Hyperparameters Across PRs
| pr_number | gradient_accum_tokens | iterations |
|---|---|---|
| 707 | 131000 | 2600 |
| 712 | 131000 | 2600 |
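As a rough illustration of what these settings imply, the sketch below estimates the total number of training tokens per run. It assumes that `gradient_accum_tokens` is the number of tokens accumulated per optimizer step and that `iterations` counts optimizer steps; neither meaning is confirmed by this page, and the helper name `total_tokens` is hypothetical.

```python
# Hyperparameter values taken from the table above, keyed by PR number.
runs = {
    707: {"gradient_accum_tokens": 131000, "iterations": 2600},
    712: {"gradient_accum_tokens": 131000, "iterations": 2600},
}


def total_tokens(params: dict) -> int:
    """Estimated token budget for one run.

    ASSUMPTION: gradient_accum_tokens is tokens per optimizer step and
    iterations is the number of optimizer steps.
    """
    return params["gradient_accum_tokens"] * params["iterations"]


budgets = {pr: total_tokens(params) for pr, params in runs.items()}

for pr, tokens in budgets.items():
    print(f"PR {pr}: {tokens:,} tokens")
```

Under that assumption both PRs train on the same budget of about 340.6 million tokens, which is consistent with the best and average BPB being identical.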