Warmup-Stable-Decay cosine schedule

Category: LR Schedule
Used in: 2 PRs
Best BPB: 1.2824
Avg BPB: 1.2824

Hyperparameters Across PRs

pr_number  parameters
744        {"warmup_fraction":0.05,"stable_fraction":0.75,"decay_fraction":0.2}
791        {"warmup_fraction":0.05,"stable_fraction":0.75,"decay_fraction":0.2}
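
To make the three fractions concrete, here is a minimal sketch of how such a schedule might be computed. Only the fraction values come from the table above; the function name wsd_cosine_lr, the peak_lr and min_lr parameters, the linear warmup shape, and the rounding of phase boundaries are assumptions for illustration and may differ from what the PRs actually implement.

```python
import math


def wsd_cosine_lr(step, total_steps, peak_lr, min_lr=0.0,
                  warmup_fraction=0.05, stable_fraction=0.75,
                  decay_fraction=0.2):
    """Hypothetical warmup-stable-decay schedule with a cosine decay tail.

    Assumptions (not confirmed by the source): linear warmup, a flat stable
    phase at peak_lr, and a half-cosine decay from peak_lr down to min_lr.
    """
    total = warmup_fraction + stable_fraction + decay_fraction
    if abs(total - 1.0) > 1e-9:
        raise ValueError("the three fractions should sum to 1")

    warmup_steps = int(round(warmup_fraction * total_steps))
    stable_steps = int(round(stable_fraction * total_steps))
    # Give the decay phase whatever remains so the phases cover every step.
    decay_steps = max(total_steps - warmup_steps - stable_steps, 1)

    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    if step < warmup_steps + stable_steps:
        # Stable phase: hold the peak learning rate.
        return peak_lr

    # Decay phase: progress runs from 0 to 1, clamped past the end.
    progress = min((step - warmup_steps - stable_steps) / decay_steps, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


if __name__ == "__main__":
    # Illustrative values only: 1000 steps and peak_lr=1.0 are not from the PRs.
    for s in (0, 49, 50, 799, 800, 900, 1000):
        print(s, wsd_cosine_lr(s, total_steps=1000, peak_lr=1.0))
```

With the fractions listed above and a hypothetical 1000-step run, this sketch would spend 50 steps warming up, 750 steps at the peak rate, and the final 200 steps decaying along a half cosine.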