EMA + SWA

Weight Averaging
Used in: 68 PRs
Best BPB: 0.1154
Avg BPB: 0.9669

Submissions

PR #593 by abaybektursun: 1.1163
PR #659 by deanbrr: 1.0920
PR #728 by abaybektursun: 1.1142
PR #734 by Robby955: 1.1198
PR #777 by Robby955: 0.9623
PR #796 by Robby955: 0.6567
PR #798 by travispchen: 0.5466
PR #841 by someone114514: 1.1157
PR #864 by aryanbhosale: 0.2841
PR #865 by aryanbhosale: 0.2841
PR #893 by aryanbhosale: 0.1310
PR #900 by Robby955: 0.1156
PR #909 by sunnypatneedi: 0.8609
PR #964 by vivekvar-dl: 1.3900
PR #968 by dentity007: 0.1154
PR #993 by aerosta: 0.9631
PR #999 by aamodbhatt: 1.1179
PR #1005 by OnlyJundong: 1.0853
PR #1019 by abaybektursun (RECORD): 1.1147
PR #1033 by Naazimsnh02: 0.4311
PR #1037 by TimPietruskyRunPod: 1.1179
PR #1070 by manfromnowhere143: 1.1190
PR #1077 by malc3om: 1.1130
PR #1087 by Dhenenjay: 1.1407
PR #1092 by teddyoweh: 1.1219
PR #1130 by Gusanidas: 1.1140
PR #1148 by aamodbhatt: 1.1179
PR #1150 by sahiee-dev: 1.1151
PR #1166 by Christopher-Lee-McClendon: 1.1347
PR #1171 by EthanYangTW: 1.1145
PR #1179 by dexhunter: 1.1105
PR #1228 by meinlebenswerk: 1.1527
PR #1230 by nestamidavaine: 1.1163
PR #1231 by nestamidavaine: 1.1163
PR #1274 by MatoTeziTanka: 1.0876
PR #1290 by aryanbhosale: 1.1104
PR #1298 by Omrigotlieb: 1.1043
PR #1302 by vlivashkin: 1.1078
PR #1318 by renqianluo: 1.0095
PR #1319 by canivel: 0.6951
PR #1325 by monisha-max: 1.3868
PR #1328 by renqianluo: 0.6361
PR #1329 by renqianluo: 0.6361
PR #1366 by yunoshev: 1.1371
PR #1376 by stukenov: 0.7094
PR #1383 by nirmathur: 1.3151
PR #1386 by Buld1n: 1.1452
PR #1397 by Mertyandimata: 1.1047
PR #1398 by Mertyandimata: 1.1047
PR #1401 by teerthsharma: 1.1100
PR #1405 by anthony-maio: 1.0856
PR #1406 by aamodbhatt: 1.0887
PR #1431 by Idan3011: 1.1266
PR #1467 by PhamPhuHoa-23: 1.1056
PR #1507 by ChideraIbe123: 0.2282
PR #1600 by sayujshah: 1.2781
PR #1645 by scottcui-georgian: 1.1131
PR #1661 by anderamondarainh-stack: 1.1444
PR #1672 by andrewbaggio1: 1.0119
PR #1679 by ChideraIbe123: 0.7625
PR #1683 by yunoshev: 1.1280
PR #1687 by resouer: 1.0409
PR #1698 by arsenis-cmd: 1.0099
PR #1705 by genji0306: 1.0339
PR #1711 by aamodbhatt: 1.0098
PR #1712 by aamodbhatt: 1.0190
PR #1722 by deborahnelson8788726: 0.6580
PR #1760 by BrandtChristian: 1.1863

Hyperparameters Across PRs

pr_number  parameters
593   {"ema_decay":0.997,"swa_every":50}
659   {"ema_decay":0.997,"swa_interval":50}
728   {"ema_decay":0.997,"swa_every":50}
734   {"ema_decay":0.997,"swa_interval_steps":50,"blend_ratio":"50/50"}
777   {"ema_decay":0.997,"swa_interval":50}
796   {"ema_decay":0.997,"swa_interval":50}
798
841   {"swa":"tight","ema":true}
864   {"ema_decay":0.997}
865   {"decay":0.997}
893   {"ema_decay":0.997}
900   {"decay":0.997,"swa_every":50}
909   {"ema_decay":0.997,"swa_every":50}
964
968   {"decay":0.997}
993
999   {"ema_decay":0.997,"swa_every":50}
1005  {"ema_decay":0.997,"swa_every":50}
1019  {"ema_decay":0.997,"swa_every":50}
1033  {"ema_decay":0.997,"swa_every":50}
1037  {"ema_decay":0.997,"swa_interval_steps":50}
1070  {"ema_decay":0.997}
1077  {"ema_decay":0.997,"swa_interval":50,"swa_start_fraction":0.5}
1087  {"ema_decay":0.997,"swa_every":50}
1092  {"ema_decay":0.997,"swa_every":50}
1130  {"ema_decay":0.997,"swa_every":50}
1148  {"ema_decay":0.997,"swa_every":50}
1150  {"decay":0.997}
1166  {"ema_decay":0.997,"swa_start_step":6350}
1171  {"ema_decay":0.997,"swa_interval_steps":50}
1179  {"ema_decay":0.997,"swa_interval":50}
1228  {"swa":"late"}
1230  {"ema_decay":0.997,"swa_every":50}
1231  {"ema_decay":0.997,"swa_every":50}
1274  {"ema_decay":0.997,"swa_every":50}
1290  {"ema_decay":0.997}
1298  {"decay":0.997,"every":50}
1302  {"ema_decay":0.997,"swa_every":50}
1318
1319  {"ema_decay":0.997,"swa_interval":50}
1325  {"ema_decay":0.997}
1328
1329
1366  {"decay":0.997,"tight_averaging":true,"collect_from":"EMA state","qgrid_lambda":false}
1376  {"ema_decay":0.997}
1383  {"ema_decay":0.997,"swa_every":50}
1386  {"decay":0.997,"start_step":0,"swa_every":50}
1397  {"blend":"30/70"}
1398  {"blend":"30/70"}
1401
1405  {"ema_decay":0.997}
1406  {"ema_decay":0.997,"swa_every":50}
1431  {"ema_decay":0.997}
1467  {"ema_decay":0.997,"swa_every":50}
1507
1600  {"decay":0.997,"enabled":true}
1645
1661  {"swa_start_scale":0.2,"swa_interval_steps":50}
1672  {"ema_decay":0.997,"swa_checkpoint_range":"17-18"}
1679
1683  {"ema_decay":0.997,"swa_every":200}
1687
1698  {"ema_decay":0.997,"swa_every":50}
1705  {"ema_decay":0.997,"swa_interval_steps":50}
1711  {"ema_decay":0.997,"swa_every":50}
1712  {"ema_decay":0.997,"swa_every":50}
1722  {"ema_decay":0.997,"swa_interval_steps":50}
1760  {"ema_decay":0.997,"swa_every":50}
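For readers unfamiliar with the two knobs that recur in the table, here is a minimal sketch of what `ema_decay` and `swa_every` conventionally control. It is not taken from any of the listed PRs: the `WeightAverager` class, its method names, and the choice to snapshot raw weights (rather than the EMA state, as PR 1366's `collect_from` suggests for that submission) are all assumptions made for illustration. The `blended` helper is likewise hypothetical; the table does not say which average each half of a ratio like "50/50" or "30/70" refers to.

```python
from dataclasses import dataclass, field


@dataclass
class WeightAverager:
    """Illustrative EMA + SWA tracker over a flat list of weights (hypothetical)."""
    ema_decay: float = 0.997  # most common value in the table
    swa_every: int = 50       # most common snapshot interval in the table
    ema: list = field(default_factory=list)
    swa: list = field(default_factory=list)
    swa_count: int = 0

    def update(self, step, weights):
        # EMA: updated on every step as decay * ema + (1 - decay) * weights.
        if not self.ema:
            self.ema = list(weights)
        else:
            d = self.ema_decay
            self.ema = [d * e + (1.0 - d) * w for e, w in zip(self.ema, weights)]
        # SWA: uniform running mean of snapshots taken every `swa_every` steps.
        # Assumption: snapshots are of the raw weights, not the EMA state.
        if step % self.swa_every == 0:
            self.swa_count += 1
            if self.swa_count == 1:
                self.swa = list(weights)
            else:
                n = self.swa_count
                self.swa = [s + (w - s) / n for s, w in zip(self.swa, weights)]

    def blended(self, ema_fraction=0.5):
        # Hypothetical linear blend of the two averages.
        if not self.ema or not self.swa:
            raise ValueError("need at least one EMA update and one SWA snapshot")
        f = ema_fraction
        return [f * e + (1.0 - f) * s for e, s in zip(self.ema, self.swa)]
```

In a training loop this would be called once per optimizer step, for example `averager.update(step, flat_weights)`, with the averaged weights swapped in for evaluation. Where a PR lists only `ema_decay`, the SWA branch presumably does not apply.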