
STE QAT

Category: Quantization
Used in: 106 PRs
Best BPB: 0.1653
Avg BPB: 1.1984
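STE QAT is quantization-aware training with a straight-through estimator: weights are fake-quantized (quantize, then dequantize) in the forward pass, while the backward pass treats the non-differentiable rounding as identity so gradients flow through unchanged. The following is a minimal sketch of symmetric per-tensor fake quantization in plain Python, illustrating the pattern the submissions below vary by bit width and scope; the function name and the per-tensor max-abs scaling are illustrative assumptions, not any specific PR's implementation.

```python
# Illustrative sketch of the fake-quantization step in STE QAT.
# Symmetric, per-tensor scaling is an assumption; individual PRs
# differ in bit width, granularity (per-row, per-block), and scope.

def fake_quant(x, bits=6):
    """Map floats onto a signed `bits`-bit integer grid, then dequantize.

    In training, the backward pass would skip round() and clamping
    entirely (the straight-through estimator treats them as identity),
    so the model learns weights that survive quantization.
    """
    qmax = 2 ** (bits - 1) - 1                 # e.g. 31 levels each side for 6-bit
    scale = max(abs(v) for v in x) / qmax or 1.0
    # round-to-nearest onto the integer grid, clamp to range, dequantize
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in x]

weights = [0.8, -0.31, 0.05, -1.0]
print(fake_quant(weights, bits=6))
```

Lower bit widths (compare the 1-bit and 2-bit rows in the table below) coarsen the grid and generally cost BPB, which is the trade-off this leaderboard measures.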

Submissions

PR #37 by khasinski: 1.2012
PR #63 by yahya010 (RECORD): 1.1598
PR #69 by TevBenji: 1.1708
PR #89 by vmfunc: 1.1622
PR #107 by m0at: 1.1648
PR #108 by kellyvv: 1.4370
PR #116 by abhishekgahlot2: 1.1666
PR #120 by andrewgcodes: 0.9588
PR #123 by saikrishnarallabandi: 1.1642
PR #128 by rsavitt: 1.1594
PR #131 by Billy1900: 1.2701
PR #137 by abhishekgahlot2: 1.1666
PR #139 by ksang123: 1.2029
PR #150 by yahya010: 1.1478
PR #170 by baudrillardsgh0st: 1.1669
PR #185 by dttdrv: 1.3043
PR #190 by newjordan: 1.1725
PR #192 by baudrillardsgh0st: 1.1502
PR #194 by baudrillardsgh0st: 1.1480
PR #200 by khasinski: 1.2012
PR #225 by dibdabo: 1.2089
PR #232 by kellyvv: 1.4370
PR #238 by kellyvv: 1.5164
PR #273 by dentity007: 1.1575
PR #295 by gowtham0992: 1.1477
PR #297 by davidpuertolas: 1.1629
PR #301 by lookin-zz: 1.1807
PR #304 by Bortlesboat: 1.4245
PR #306 by xuafeng: 1.1448
PR #324 by crony-io: 1.1702
PR #326 by crony-io: 1.2890
PR #344 by aryanbhosale: 1.1330
PR #348 by EthanYangTW: 1.1444
PR #358 by adityagupta26: 1.1400
PR #359 by tmustier: 1.1345
PR #360 by MultiFe22: 1.1426
PR #372 by HyperPotatoNeo: 1.1361
PR #374 by unnir (RECORD): 1.1246
PR #383 by joelnishanth: 1.1320
PR #385 by dentity007: 1.1488
PR #389 by trasnake87: 1.1466
PR #401 by newjordan: 1.1243
PR #433 by Robby955: 1.3441
PR #440 by Ashutosh3142857: 1.2219
PR #450 by zachgoldfine44: 1.1466
PR #454 by nalediym: 1.2055
PR #455 by kasimte: 1.1299
PR #531 by pragnyanramtha: 1.1324
PR #559 by Parswanadh: 1.5348
PR #573 by Sarimsaljook: 1.0523
PR #575 by k-oconnor: 1.1750
PR #667 by suchitj2702: 1.1352
PR #670 by abaybektursun: 1.1171
PR #695 by 0xNoramiya: 1.1360
PR #696 by gravelBridge: 1.2622
PR #709 by StolbaJ: 1.1478
PR #710 by Dhruba531: 1.1240
PR #754 by aryanbhosale: 1.1253
PR #760 by erikqu: 1.2185
PR #805 by zeytx: 1.1807
PR #816 by jimliu741523: 1.1194
PR #842 by JUSTSUJAY: 1.3380
PR #892 by robbiebusinessacc: 1.1428
PR #915 by anthony-maio: 0.9642
PR #918 by haikosys: 0.1653
PR #927 by Tonyy1977: 1.1696
PR #929 by andreanjos: 1.1653
PR #979 by 0xadvait: 1.1387
PR #989 by alexanderaperry-arch: 1.1402
PR #1032 by wfproc: 1.3631
PR #1045 by Hilo-Hilo: 1.1509
PR #1057 by Programmerryoki: 1.2201
PR #1067 by dheeren-tejani: 1.4242
PR #1068 by LappyG: 1.1510
PR #1070 by manfromnowhere143: 1.1190
PR #1077 by malc3om: 1.1130
PR #1087 by Dhenenjay: 1.1407
PR #1154 by LucasErcolano: 1.7757
PR #1202 by VirajDeshwal: 1.1412
PR #1227 by himanshudongre: 1.4841
PR #1228 by meinlebenswerk: 1.1527
PR #1284 by tyrel-beede: 1.1207
PR #1290 by aryanbhosale: 1.1104
PR #1357 by mollahasani: 1.2200
PR #1385 by korentomas: 1.4465
PR #1388 by CiprianFlorin-Ifrim: 1.5390
PR #1417 by BruhTheMomentum: 1.3039
PR #1467 by PhamPhuHoa-23: 1.1056
PR #1484 by AlirezaAlampour: 1.6656
PR #1486 by AlirezaAlampour: 1.6656
PR #1509 by Lumi-node: 1.1962
PR #1512 by Itssshikhar: 1.1117
PR #1559 by adityasasidhar: 1.2498
PR #1582 by He-Wenhao: 1.3428
PR #1602 by SPThole: 1.0744
PR #1621 by mrbese: 1.1531
PR #1640 by thestbobo: 1.1412

Hyperparameters Across PRs

pr_number | bits | scope
37 | 6 | all
63 | 6 | all 2D block weights
69 | 6 | block weights
89 | 6 | per-row block weights
107 | - | post-training quantization-aware training
108 | 6 | all
116 | 6 | MLP and attention weights; fp16 passthrough for tied embedding and small/control tensors
120 | 6 | transformer blocks
123 | 6 | weights
128 | 6 | weights
131 | 6 | transformer block weights
137 | 6 | MLP and attention weights; fp16 passthrough for tied embedding
139 | 2 | all linear layers (attention and MLP); ternary {-1, 0, 1} weights
150 | 6 | all
170 | 6 | all weights
185 | 8 | model weights
190 | 6 | all weight matrices except embeddings
192 | 6 | all
194 | 6 | all weights with fp16 tied embeddings
200 | 6 | all
225 | 6 | large matrices / model weights
232 | 6 | all
238 | - | all
273 | 6 | all
295 | 5 | MLP
295 | 6 | attention
297 | 6 | MLP and attention weight matrices / full model quantized artifact
301 | 6 | all weights
304 | 5 | MLP layers
304 | 6 | attention layers
306 | 5 | MLP
306 | 6 | attention
324 | 5 | MLPs
324 | 6 | Attention
326 | 5 | MLPs and Attention
344 | - | final 15% of training
348 | 5 | MLP
348 | 6 | attention
358 | 8 | all
359 | 6 | all
360 | 5 | MLP
360 | 6 | attention
372 | 6 | attention weights
372 | 5 | MLP weights
374 | 6 | MLP + attention weights
383 | 6 | MLP + attention weights
385 | 6 | all
389 | 5 | final ~5% of training
401 | 6 | MLP + attention weights
433 | 6 | all
440 | 4 | MLP
450 | 6 | all
454 | 6 | all
455 | 6 | MLP and attention weights; int8 for embeddings
531 | 6 | weights during backward pass when LR < 15% peak
559 | 1 | MLP
573 | 6 | late QAT when LR scale < 0.15
575 | 8 | embeddings
667 | 6 | all bank parameters
670 | - | all
695 | 6 | MLP and attention weights
696 | 6 | all weights
709 | 5 | MLP
709 | 6 | attention and bigram-proj
710 | 6 | model weights
754 | 6 | all weights
760 | 2 | all weights
805 | - | all
816 | 6 | all
842 | 8 | all
892 | 6 | int6
915 | - | model
918 | - | all
927 | 6 | large weight matrices
929 | 6 | all
979 | 6 | attn/MLP weights
989 | 6 | all
1032 | 6 | all
1045 | 6 | all
1057 | 6 | all
1067 | - | block weights
1068 | 6 | all large weight matrices
1070 | 6 | late QAT
1077 | 6 | mixed; MLP int5, attention int6
1087 | 5 | MLP
1087 | 6 | attention
1154 | - | structural weights
1202 | 6 | all
1227 | 5 | all
1228 | 6 | all
1284 | 6 | parameter banks
1290 | - | final 15% wallclock
1357 | 6 | all
1385 | 8 | all
1388 | - | weights and activations
1417 | - | all weights
1467 | - | all
1484 | 8 | forward pass
1486 | 8 | all weight matrices
1509 | 4 | all
1512 | 6 | all F.linear params
1559 | 8 | selected CastedLinear weights
1582 | 8 | weights
1602 | - | all
1621 | 6 | all
1640 | 6 | all linear weights