late QAT
Quantization
Used in: 94 PRs
Best BPB: 0.0939
Avg BPB: 1.0551
Submissions

| PR | Author | BPB |
|---|---|---|
| #864 | aryanbhosale | 0.2841 |
| #908 | albertorkive | 1.1734 |
| #909 | sunnypatneedi | 0.8609 |
| #915 | anthony-maio | 0.9642 |
| #921 | TimPietrusky | 0.0939 |
| #926 | NandhuRajRK | 0.8705 |
| #932 | anthony-maio | 1.1580 |
| #937 | mihir-s-05 | 1.4457 |
| #941 | aptsalt | 1.3620 |
| #952 | FlashyFlash3011 | 1.1144 |
| #963 | sunnypatneedi | 0.8609 |
| #965 | Adam-Jacuch | 1.1184 |
| #974 | anthony-maio | 1.6542 |
| #975 | Abhishek8108 | 1.1216 |
| #999 | aamodbhatt | 1.1179 |
| #1002 | SoHarshh | 1.1650 |
| #1009 | SoHarshh | 1.1574 |
| #1015 | shram86 | 1.2115 |
| #1019 | abaybektursun (RECORD) | 1.1147 |
| #1029 | fielding | 1.1520 |
| #1032 | wfproc | 1.3631 |
| #1039 | yufengli-oai | 1.1184 |
| #1066 | adityakm24 | 1.1259 |
| #1085 | adityasasidhar | 1.2831 |
| #1086 | Omrigotlieb | 1.1349 |
| #1087 | Dhenenjay | 1.1407 |
| #1089 | mikeapedia | 1.1086 |
| #1092 | teddyoweh | 1.1219 |
| #1098 | adityakm24 | 1.1187 |
| #1105 | abaybektursun | 1.2208 |
| #1112 | dillon-blake | 1.2252 |
| #1117 | adityakm24 | 1.1187 |
| #1118 | adityakm24 | 1.1187 |
| #1120 | newjordan | 1.1099 |
| #1126 | AnirudhRahul | 1.1091 |
| #1127 | dentity007 | 1.1311 |
| #1129 | EthanYangTW | 1.1174 |
| #1130 | Gusanidas | 1.1140 |
| #1148 | aamodbhatt | 1.1179 |
| #1166 | Christopher-Lee-McClendon | 1.1347 |
| #1180 | estesryan | 1.0577 |
| #1182 | adityakm24 | 1.1227 |
| #1209 | andrewbaggio1 | 1.1064 |
| #1230 | nestamidavaine | 1.1163 |
| #1231 | nestamidavaine | 1.1163 |
| #1236 | ibarrajo | 1.1179 |
| #1237 | ibarrajo | 1.1198 |
| #1247 | fahmitech | 1.2208 |
| #1263 | xexyz | 0.9354 |
| #1303 | anthony-maio | 0.9462 |
| #1311 | htrung1105 | 1.1303 |
| #1313 | anthony-maio | 0.8637 |
| #1313 | anthony-maio | 0.8637 |
| #1319 | canivel | 0.6951 |
| #1321 | anthony-maio | 0.7406 |
| #1322 | newjordan | 1.0854 |
| #1324 | yahya010 | 0.8275 |
| #1355 | mradassaad | 1.1526 |
| #1361 | jorge-asenjo | 1.1220 |
| #1370 | Christopher-Lee-McClendon | 1.0030 |
| #1378 | Rajat123456789 | 1.1711 |
| #1386 | Buld1n | 1.1452 |
| #1405 | anthony-maio | 1.0856 |
| #1414 | Abhishek8108 | 0.7093 |
| #1440 | Mertyandimata | 1.1026 |
| #1452 | bsisduck | 0.3509 |
| #1454 | bsisduck | 0.3509 |
| #1473 | AVINASH0052 | 1.1156 |
| #1488 | ndokutovich | 0.8265 |
| #1499 | dippatel1994 | 1.6323 |
| #1501 | SPThole | 1.1159 |
| #1502 | SPThole | 1.1147 |
| #1528 | xiehuanyi | 1.1104 |
| #1562 | joshkmartinez | 1.0205 |
| #1563 | joshkmartinez | 1.0205 |
| #1564 | joshkmartinez | 1.0171 |
| #1568 | yuitokyouni | 1.1639 |
| #1617 | adityasasidhar | 1.2192 |
| #1619 | AVINASH0052 | 1.1156 |
| #1622 | joshkmartinez | 1.0171 |
| #1643 | mradassaad | 1.1473 |
| #1644 | mradassaad | 1.1473 |
| #1661 | anderamondarainh-stack | 1.1444 |
| #1666 | mrbese | 1.1531 |
| #1679 | ChideraIbe123 | 0.7625 |
| #1681 | OE-GOD | 1.0208 |
| #1683 | yunoshev | 1.1280 |
| #1687 | resouer | 1.0409 |
| #1696 | kings-crown | 1.1224 |
| #1698 | arsenis-cmd | 1.0099 |
| #1705 | genji0306 | 1.0339 |
| #1711 | aamodbhatt | 1.0098 |
| #1712 | aamodbhatt | 1.0190 |
| #1722 | deborahnelson8788726 | 0.6580 |

Hyperparameters Across PRs

| PR | Bits | Scope |
|---|---|---|
| 864 | 6 | model |
| 908 | — | model |
| 909 | — | all |
| 915 | — | model |
| 921 | — | — |
| 926 | — | model |
| 932 | — | model |
| 937 | 6 | exported artifact |
| 941 | — | — |
| 952 | 6 | non-bank params |
| 963 | — | all |
| 965 | — | model |
| 974 | — | block weights |
| 975 | 6 | all |
| 999 | 6 | model |
| 1002 | 4 | training |
| 1009 | 4 | MLP + bigram |
| 1015 | — | matrix_only |
| 1019 | — | all |
| 1029 | — | model |
| 1032 | 6 | all |
| 1039 | 6 | model |
| 1066 | — | mixed int6/int8 |
| 1085 | — | final 40% of training |
| 1086 | 6 | last 15% of warmdown |
| 1087 | 5 | MLP |
| 1089 | — | quantized weights |
| 1092 | 6 | all |
| 1098 | — | all |
| 1105 | 6 | all |
| 1112 | — | all |
| 1117 | — | all |
| 1118 | — | artifact |
| 1120 | 6 | embeddings and 5 layers |
| 1126 | — | weights |
| 1127 | 6 | model weights |
| 1129 | 6 | all |
| 1130 | — | full model |
| 1148 | — | model |
| 1166 | — | all |
| 1180 | 6 | all |
| 1182 | 6 | all |
| 1209 | — | all |
| 1230 | 6 | core weights |
| 1231 | 6 | all |
| 1236 | — | all |
| 1237 | 5 | model |
| 1247 | 6 | all |
| 1263 | 6 | full model |
| 1303 | 6 | all |
| 1311 | 6 | model |
| 1313 | — | all |
| 1313 | 6 | all |
| 1319 | — | all |
| 1321 | 6 | all |
| 1322 | 6 | weights |
| 1324 | — | all |
| 1355 | — | all |
| 1361 | 6 | final warmdown phase |
| 1370 | 6 | model |
| 1378 | — | all |
| 1386 | 6 | export |
| 1405 | — | all |
| 1414 | 6 | all |
| 1440 | 6 | all |
| 1452 | — | all |
| 1454 | — | all |
| 1473 | 6 | all |
| 1488 | — | all |
| 1499 | — | all |
| 1501 | 8 | model |
| 1502 | — | all |
| 1528 | — | — |
| 1562 | — | all |
| 1563 | — | — |
| 1564 | — | all |
| 1568 | 6 | all |
| 1617 | — | final 15% of training |
| 1619 | — | all |
| 1622 | 6 | all |
| 1643 | — | block weights and embeddings |
| 1644 | — | all |
| 1661 | 6 | MLP and attention 2D weights |
| 1666 | 6 | all |
| 1679 | — | model |
| 1681 | — | — |
| 1683 | 4 | model |
| 1687 | 6 | artifact path |
| 1696 | — | full model |
| 1698 | — | all |
| 1705 | 6 | artifact path |
| 1711 | 6 | matrices and embeddings |
| 1712 | 6 | matrices and embeddings |
| 1722 | 6 | all |
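
To make the `bits` and `scope` columns concrete, here is a minimal sketch of what late QAT generally refers to: weights are trained in full precision, and only in a final slice of training are they passed through a "fake quantization" step that rounds them to a low-bit grid. This is not taken from any PR listed above. The function names (`fake_quantize`, `qat_active`, `forward_weights`), the symmetric per-tensor scaling, and the default `late_fraction=0.15` are all assumptions chosen for illustration. In an autograd framework the rounding step would typically be paired with a straight-through estimator so gradients still reach the full-precision weights; that part is omitted here.

```python
def fake_quantize(weights, bits=6):
    """Round weights to a symmetric signed `bits`-bit grid and map back to floats.

    Assumption: one scale per tensor, taken from the largest absolute weight.
    """
    qmax = 2 ** (bits - 1) - 1  # e.g. 31 for 6 bits
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0:
        return list(weights)  # nothing to scale
    scale = max_abs / qmax
    # Quantize to an integer level, clamp to the grid, then dequantize.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def qat_active(step, total_steps, late_fraction=0.15):
    """True once training has entered the final `late_fraction` of steps (hypothetical schedule)."""
    late_start = total_steps - int(total_steps * late_fraction)
    return step >= late_start


def forward_weights(weights, step, total_steps, bits=6, late_fraction=0.15):
    """Weights the forward pass would see: full precision early, fake-quantized late."""
    if qat_active(step, total_steps, late_fraction):
        return fake_quantize(weights, bits)
    return list(weights)


if __name__ == "__main__":
    w = [0.30, -0.70, 0.05]
    print(forward_weights(w, step=100, total_steps=1000))  # early: unchanged
    print(forward_weights(w, step=900, total_steps=1000))  # late: snapped to the int6 grid
```

Under these assumptions, a `scope` entry such as "final 15% of training" would correspond to `late_fraction=0.15`, and a `bits` entry of 6 to `bits=6`. Scopes such as "MLP" or "matrices and embeddings" would restrict which tensors are passed through `fake_quantize`.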