
STE QAT

Category: Quantization
Used in: 116 PRs
Best BPB: 0.1653
Avg BPB: 1.2042
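STE QAT (quantization-aware training with a straight-through estimator) rounds weights onto a low-bit grid in the forward pass while the backward pass treats the non-differentiable rounding as identity, so gradients still update the latent full-precision weights. A minimal pure-Python sketch of symmetric per-tensor fake quantization, for illustration only (not taken from any listed PR):

```python
def fake_quantize(w, bits=6):
    """Symmetric per-tensor fake quantization: round each weight to a
    signed `bits`-bit grid, then dequantize back to floats. In QAT the
    backward pass treats the rounding as identity (the straight-through
    estimator), so gradients flow to the latent full-precision weights."""
    qmax = 2 ** (bits - 1) - 1                    # e.g. 31 for 6-bit signed
    scale = max(abs(x) for x in w) / qmax or 1.0  # avoid div-by-zero on all-zero w
    q = [max(-qmax, min(qmax, round(x / scale))) for x in w]
    return [v * scale for v in q]
```

The quantization error of any in-range weight is at most half a grid step (`scale / 2`), which is why 6-bit QAT entries above stay close to the full-precision baseline.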

Submissions

PR #37 by khasinski: 1.2012
PR #63 by yahya010: 1.1598 (record)
PR #69 by TevBenji: 1.1708
PR #89 by vmfunc: 1.1622
PR #107 by m0at: 1.1648
PR #108 by kellyvv: 1.4370
PR #116 by abhishekgahlot2: 1.1666
PR #120 by andrewgcodes: 0.9588
PR #123 by saikrishnarallabandi: 1.1642
PR #128 by rsavitt: 1.1594
PR #131 by Billy1900: 1.2701
PR #137 by abhishekgahlot2: 1.1666
PR #139 by ksang123: 1.2029
PR #150 by yahya010: 1.1478
PR #170 by baudrillardsgh0st: 1.1669
PR #185 by dttdrv: 1.3043
PR #190 by newjordan: 1.1725
PR #192 by baudrillardsgh0st: 1.1502
PR #194 by baudrillardsgh0st: 1.1480
PR #200 by khasinski: 1.2012
PR #225 by dibdabo: 1.2089
PR #232 by kellyvv: 1.4370
PR #238 by kellyvv: 1.5164
PR #273 by dentity007: 1.1575
PR #295 by gowtham0992: 1.1477
PR #297 by davidpuertolas: 1.1629
PR #301 by lookin-zz: 1.1807
PR #304 by Bortlesboat: 1.4245
PR #306 by xuafeng: 1.1448
PR #324 by crony-io: 1.1702
PR #326 by crony-io: 1.2890
PR #344 by aryanbhosale: 1.1330
PR #348 by EthanYangTW: 1.1444
PR #358 by adityagupta26: 1.1400
PR #359 by tmustier: 1.1345
PR #360 by MultiFe22: 1.1426
PR #372 by HyperPotatoNeo: 1.1361
PR #374 by unnir: 1.1246 (record)
PR #383 by joelnishanth: 1.1320
PR #385 by dentity007: 1.1488
PR #389 by trasnake87: 1.1466
PR #401 by newjordan: 1.1243
PR #433 by Robby955: 1.3441
PR #440 by Ashutosh3142857: 1.2219
PR #450 by zachgoldfine44: 1.1466
PR #454 by nalediym: 1.2055
PR #455 by kasimte: 1.1299
PR #531 by pragnyanramtha: 1.1324
PR #559 by Parswanadh: 1.5348
PR #573 by Sarimsaljook: 1.0523
PR #575 by k-oconnor: 1.1750
PR #667 by suchitj2702: 1.1352
PR #670 by abaybektursun: 1.1171
PR #695 by 0xNoramiya: 1.1360
PR #696 by gravelBridge: 1.2622
PR #709 by StolbaJ: 1.1478
PR #710 by Dhruba531: 1.1240
PR #754 by aryanbhosale: 1.1253
PR #760 by erikqu: 1.2185
PR #805 by zeytx: 1.1807
PR #816 by jimliu741523: 1.1194
PR #842 by JUSTSUJAY: 1.3380
PR #892 by robbiebusinessacc: 1.1428
PR #915 by anthony-maio: 0.9642
PR #918 by haikosys: 0.1653
PR #927 by Tonyy1977: 1.1696
PR #929 by andreanjos: 1.1653
PR #979 by 0xadvait: 1.1387
PR #989 by alexanderaperry-arch: 1.1402
PR #1032 by wfproc: 1.3631
PR #1045 by Hilo-Hilo: 1.1509
PR #1057 by Programmerryoki: 1.2201
PR #1067 by dheeren-tejani: 1.4242
PR #1068 by LappyG: 1.1510
PR #1070 by manfromnowhere143: 1.1190
PR #1077 by malc3om: 1.1130
PR #1087 by Dhenenjay: 1.1407
PR #1154 by LucasErcolano: 1.7757
PR #1202 by VirajDeshwal: 1.1412
PR #1227 by himanshudongre: 1.4841
PR #1228 by meinlebenswerk: 1.1527
PR #1284 by tyrel-beede: 1.1207
PR #1290 by aryanbhosale: 1.1104
PR #1357 by mollahasani: 1.2200
PR #1385 by korentomas: 1.4465
PR #1388 by CiprianFlorin-Ifrim: 1.5390
PR #1417 by BruhTheMomentum: 1.3039
PR #1467 by PhamPhuHoa-23: 1.1056
PR #1484 by AlirezaAlampour: 1.6656
PR #1486 by AlirezaAlampour: 1.6656
PR #1509 by Lumi-node: 1.1962
PR #1512 by Itssshikhar: 1.1117
PR #1559 by adityasasidhar: 1.2498
PR #1582 by He-Wenhao: 1.3428
PR #1602 by SPThole: 1.0744
PR #1621 by mrbese: 1.1531
PR #1640 by thestbobo: 1.1412
PR #1811 by peytontolbert: 1.2350
PR #1821 by anjing00monyet-arch: 1.3825
PR #1866 by deborahnelson8788726: 1.5042
PR #1896 by G3sparky: 1.1155
PR #1930 by CarlosItp: 1.2600
PR #2016 by sea-rod: 1.2302
PR #2022 by BharathSShankar: 1.0720
PR #2042 by FF-GardenFn: 1.3641
PR #2048 by kineticforge: 1.3551
PR #2086 by deniskurlov: 1.1384

Hyperparameters Across PRs

PR     Bits  Scope
37     6     all
63     6     all 2D block weights
69     6     block weights
89     6     per-row block weights
107    -     post-training quantization-aware training
108    6     all
116    6     MLP and attention weights; fp16 passthrough for tied embedding and small/control tensors
120    6     transformer blocks
123    6     weights
128    6     weights
131    6     transformer block weights
137    6     MLP and attention weights; fp16 passthrough for tied embedding
139    2     all linear layers (attention and MLP); ternary {-1, 0, 1} weights
150    6     all
170    6     all weights
185    8     model weights
190    6     all weight matrices except embeddings
192    6     all
194    6     all weights with fp16 tied embeddings
200    6     all
225    6     large matrices / model weights
232    6     all
238    -     all
273    6     all
295    5     MLP
295    6     attention
297    6     MLP and attention weight matrices / full model quantized artifact
301    6     all weights
304    5     MLP layers
304    6     attention layers
306    5     MLP
306    6     attention
324    5     MLPs
324    6     Attention
326    5     MLPs and Attention
344    -     final 15% of training
348    5     MLP
348    6     attention
358    8     all
359    6     all
360    5     MLP
360    6     attention
372    6     attention weights
372    5     MLP weights
374    6     MLP + attention weights
383    6     MLP + attention weights
385    6     all
389    5     final ~5% of training
401    6     MLP + attention weights
433    6     all
440    4     MLP
450    6     all
454    6     all
455    6     MLP and attention weights; int8 for embeddings
531    6     weights during backward pass when LR < 15% peak
559    1     MLP
573    6     late QAT when LR scale < 0.15
575    8     embeddings
667    6     all bank parameters
670    -     all
695    6     MLP and attention weights
696    6     all weights
709    5     MLP
709    6     attention and bigram-proj
710    6     model weights
754    6     all weights
760    2     all weights
805    -     all
816    6     all
842    8     all
892    6     int6
915    -     model
918    -     all
927    6     large weight matrices
929    6     all
979    6     attn/MLP weights
989    6     all
1032   6     all
1045   6     all
1057   6     all
1067   -     block weights
1068   6     all large weight matrices
1070   6     late QAT
1077   6     mixed; MLP int5, attention int6
1087   5     MLP
1087   6     attention
1154   -     structural weights
1202   6     all
1227   5     all
1228   6     all
1284   6     parameter banks
1290   -     final 15% wallclock
1357   6     all
1385   8     all
1388   -     weights and activations
1417   -     all weights
1467   -     all
1484   8     forward pass
1486   8     all weight matrices
1509   4     all
1512   6     all F.linear params
1559   8     selected CastedLinear weights
1582   8     weights
1602   -     all
1621   6     all
1640   6     all linear weights
1811   -     weights
1821   -     embeddings
1866   2     mostly weights
1896   -     weights
1930   8     model weights
2016   6     attention and MLP activations
2022   6     activations
2042   1     HydraMLP gate_up and down weights
2048   1     grouped linear weights
2086   5     quinary weights
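Most of the configurations above quantize only the forward pass and rely on the straight-through estimator in the backward pass. A toy scalar sketch of that training loop follows; the fixed range, learning rate, and target value are arbitrary illustrative choices, not taken from any listed PR:

```python
def fq(w, bits=6, wmax=1.0):
    # Fake-quantize a scalar onto the signed `bits`-bit grid over [-wmax, wmax].
    qmax = 2 ** (bits - 1) - 1
    s = wmax / qmax
    return max(-qmax, min(qmax, round(w / s))) * s

# Fit y = 0.7 * x (with x = 1) using a single weight. The forward pass
# uses the quantized weight; the update treats d fq/dw as 1 (the
# straight-through estimator), so the latent float weight `w` keeps
# receiving a smooth gradient signal despite the rounding.
w, lr = 0.0, 0.05
for _ in range(300):
    pred = fq(w)                   # quantized forward pass
    grad = 2.0 * (pred - 0.7)      # d/dpred of (pred - 0.7) ** 2
    w -= lr * grad                 # STE: gradient applied to the float weight
# After training, fq(w) sits within one grid step of the target.
```

The latent weight settles near the boundary between the two grid points bracketing the target, so the quantized prediction ends within one grid step (1/31 here) of 0.7.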