
adaptive cosine decay

LR Schedule
Used in: 2 PRs
Best BPB: 0.2071
Avg BPB: 0.6236

Hyperparameters Across PRs

pr_number  parameters
731        {"ramp_multiplier_start":1,"ramp_multiplier_end":3,"ramp_fraction":0.3}
851        {"adaptive_lr":true,"adaptive_lr_max":3}
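The page does not say how these parameters enter the schedule, so the following is only a sketch of one plausible reading of the PR 731 settings: a standard cosine decay whose output is scaled by a multiplier that ramps linearly from ramp_multiplier_start to ramp_multiplier_end over the first ramp_fraction of training, then holds. The function name, the total_steps, base_lr, and min_lr arguments, and the ramp-then-hold behavior are all assumptions for illustration, not the PR's implementation. The adaptive_lr and adaptive_lr_max settings from PR 851 are not modeled because their semantics are not given here.

```python
import math


def adaptive_cosine_lr(
    step,
    total_steps,
    base_lr,
    ramp_multiplier_start=1.0,  # value taken from the PR 731 row
    ramp_multiplier_end=3.0,    # value taken from the PR 731 row
    ramp_fraction=0.3,          # value taken from the PR 731 row
    min_lr=0.0,                 # hypothetical floor, not listed on the page
):
    """Cosine decay scaled by a linearly ramped multiplier (assumed semantics)."""
    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)

    # Plain cosine decay from base_lr down to min_lr.
    cosine_lr = min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

    # Assumed ramp: interpolate the multiplier over the first
    # ramp_fraction of training, then hold it at the end value.
    if ramp_fraction <= 0:
        ramp_t = 1.0
    else:
        ramp_t = min(progress / ramp_fraction, 1.0)
    multiplier = ramp_multiplier_start + (ramp_multiplier_end - ramp_multiplier_start) * ramp_t

    return cosine_lr * multiplier
```

Under this reading the effective learning rate starts at base_lr, rises while the multiplier grows faster than the cosine term shrinks, and still reaches min_lr times the end multiplier at the final step. Calling adaptive_cosine_lr(step, total_steps, base_lr) once per optimizer step is enough to drive a training loop.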