
Late QAT activation based on LR scale threshold

Category: LR Schedule
Used in: 1 PR
Best BPB: 1.1807
Avg BPB: 1.1807

Hyperparameters Across PRs

pr_number    parameters
805          {"lr_scale_threshold": 0.15}
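
One plausible reading of this technique is that quantization-aware training (QAT) stays off for most of the run and is switched on only once the learning-rate scale (current LR divided by peak LR) decays below lr_scale_threshold. The sketch below illustrates that reading. The linear warmdown schedule, the strict "below threshold" comparison, and the function names lr_scale, qat_enabled, and first_qat_step are all assumptions made for illustration; they are not taken from PR 805.

```python
def lr_scale(step, total_steps, warmdown_steps):
    """Hypothetical schedule: scale 1.0, then linear decay to 0 over the last warmdown_steps."""
    remaining = total_steps - step
    if remaining >= warmdown_steps:
        return 1.0
    return max(remaining, 0) / warmdown_steps


def qat_enabled(scale, lr_scale_threshold=0.15):
    """Assumed rule: QAT turns on once the LR scale drops strictly below the threshold."""
    return scale < lr_scale_threshold


def first_qat_step(total_steps, warmdown_steps, lr_scale_threshold=0.15):
    """Return the first step at which QAT would be active, or None if it never is."""
    for step in range(total_steps):
        if qat_enabled(lr_scale(step, total_steps, warmdown_steps), lr_scale_threshold):
            return step
    return None
```

Under these assumptions, a 1000-step run with a 200-step warmdown would activate QAT for roughly the final 3% of training. In a real training loop the check would typically gate whether fake quantization is applied in the forward pass.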