cosine decay + hold + linear warmdown

LR Schedule
Used in: 1 PR
Best BPB: 0.4380
Avg BPB: 0.4380

Hyperparameters Across PRs

pr_number    parameters
663          {"cosine_decay_to_fraction":0.1,"cosine_decay_steps":3400,"hold_steps":[3400,5500],"linear_warmdown_to_zero":true}
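
One plausible reading of the PR 663 parameters, sketched as a learning-rate multiplier: cosine decay from the full rate to 0.1 of it over the first 3400 steps, a hold at that fraction until step 5500, then a linear warmdown to zero. The function name lr_multiplier and the total_steps argument are assumptions for illustration. The table does not say where the warmdown ends or what the base learning rate is, so this may not match the PR's actual implementation.

```python
import math


def lr_multiplier(step, total_steps,
                  cosine_decay_steps=3400,
                  cosine_decay_to_fraction=0.1,
                  hold_steps=(3400, 5500)):
    """Return the factor to multiply the base learning rate by at `step`.

    `total_steps` is a hypothetical parameter: the step at which the
    linear warmdown reaches zero. It is not given in the table.
    """
    _, hold_end = hold_steps
    if total_steps <= hold_end:
        raise ValueError("total_steps must be greater than the end of the hold phase")

    floor = cosine_decay_to_fraction

    # Phase 1: cosine decay from 1.0 down to `floor`.
    if step < cosine_decay_steps:
        progress = step / cosine_decay_steps
        return floor + (1.0 - floor) * 0.5 * (1.0 + math.cos(math.pi * progress))

    # Phase 2: hold at `floor` until the end of the hold window.
    if step < hold_end:
        return floor

    # Phase 3: linear warmdown from `floor` to zero at `total_steps`.
    if step >= total_steps:
        return 0.0
    return floor * (total_steps - step) / (total_steps - hold_end)
```

To use it, multiply the base learning rate by lr_multiplier(step, total_steps) at each step. With a hypothetical total_steps of 7000, the warmdown phase would cover steps 5500 through 7000.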