Tight SWA

Weight Averaging
Used in: 16 PRs
Best BPB: 0.0830
Avg BPB: 0.9778

Hyperparameters Across PRs

pr_number  parameters
535        {"frequency_steps":50,"scale_threshold":0.2}
543        {"scale_threshold":0.2,"checkpoints_averaged":6,"checkpoint_interval":50,"quality_penalty":"zero"}
569        {"frequency":"every 50 steps","condition":"when LR scale < 0.2"}
609
887
915
932
960
986
1016
1030
1043       {"interval":50}
1184       {"interval":50}
1247       {"interval":50}
1364       {"every_steps":50}
1473       {"start_step":6150}
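
The most common settings in the table (a snapshot every 50 steps, taken only once the LR scale drops below 0.2) suggest a schedule along the following lines. This is a minimal sketch and not code from any of the listed PRs: the class `TightSWA`, its method `maybe_update`, and the representation of weights as plain lists of floats are hypothetical, and `lr_scale` is assumed to mean the current learning rate divided by its peak value.

```python
from typing import Dict, List, Optional

Weights = Dict[str, List[float]]


class TightSWA:
    """Running mean of weight snapshots collected only late in training.

    Hypothetical helper: the names and defaults mirror the table above
    (interval=50, scale_threshold=0.2) but are not taken from any PR.
    """

    def __init__(self, interval: int = 50, scale_threshold: float = 0.2) -> None:
        self.interval = interval
        self.scale_threshold = scale_threshold
        self.count = 0  # number of snapshots averaged so far
        self.average: Optional[Weights] = None

    def maybe_update(self, step: int, lr_scale: float, weights: Weights) -> bool:
        """Fold `weights` into the average if this step qualifies.

        A step qualifies when it is a multiple of `interval` and the
        (assumed) LR scale is strictly below `scale_threshold`.
        Returns True if a snapshot was taken.
        """
        if step % self.interval != 0 or lr_scale >= self.scale_threshold:
            return False
        self.count += 1
        if self.average is None:
            # First snapshot: copy so later in-place updates do not alias.
            self.average = {name: list(values) for name, values in weights.items()}
        else:
            # Incremental mean: avg += (x - avg) / n
            for name, values in weights.items():
                avg = self.average[name]
                for i, x in enumerate(values):
                    avg[i] += (x - avg[i]) / self.count
        return True
```

In a training loop, `maybe_update` would be called once per step with the current weights, and `average` would be used for evaluation in place of the final weights. Rows that give `checkpoints_averaged` or `start_step` would need an extra cap or start condition that this sketch does not model.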