
GPTQ-lite (Quantization)

Used in: 106 PRs
Best BPB: 0.0280
Avg BPB:  1.0834

Submissions

PR #64 by yesbhautik: 1.1250
PR #175 by anthony-maio: 1.1229
PR #344 by aryanbhosale: 1.1330
PR #379 by dannywillowliu-uchi: 1.1257
PR #414 by signalrush: 1.1233
PR #429 by AbhisekBasu1: 1.1231
PR #445 by newjordan: 1.1236
PR #456 by Christopher-Lee-McClendon: 1.1532
PR #473 by abaybektursun: 1.1214
PR #478 by gowtham0992: 1.1268
PR #518 by sofiabod: 1.0622
PR #531 by pragnyanramtha: 1.1324
PR #534 by rarce: 1.1804
PR #549 by abaybektursun (RECORD): 1.1194
PR #584 by ssatia: 1.1233
PR #617 by ryanadamsai: 1.1228
PR #625 by Joeavaib: 1.1194
PR #642 by minh-stakc: 0.8173
PR #645 by FlynnCruse: 1.8990
PR #657 by anthony-maio: 1.1234
PR #659 by deanbrr: 1.0920
PR #668 by Christopher-Lee-McClendon: 1.0920
PR #682 by gthgomez: 1.1233
PR #691 by xexyz: 1.0988
PR #710 by Dhruba531: 1.1240
PR #714 by Upsalla: 1.1187
PR #720 by agalimova: 1.1078
PR #726 by DeepReinforce: 1.1147
PR #733 by stukenov: 1.0278
PR #745 by stukenov: 1.0222
PR #754 by aryanbhosale: 1.1253
PR #758 by hypery11: 1.0465
PR #762 by robinojw: 0.7139
PR #763 by hypery11: 0.9917
PR #768 by mradassaad: 1.1201
PR #770 by minh-stakc: 0.6672
PR #771 by sunnypatneedi: 1.0705
PR #784 by iverbovoy: 1.2065
PR #786 by shinegami-2002: 0.8128
PR #794 by jeremyschied: 1.3346
PR #795 by hypery11: 0.8881
PR #797 by armantsaturian: 0.8960
PR #805 by zeytx: 1.1807
PR #816 by jimliu741523: 1.1194
PR #827 by Programmerryoki: 1.3999
PR #838 by aryanbhosale: 1.1215
PR #857 by aruniyer: 1.1093
PR #865 by aryanbhosale: 0.2841
PR #870 by simon-marcus: 0.0935
PR #882 by IshiPareek: 1.3762
PR #884 by BhatiaUday: 1.1448
PR #887 by anthony-maio: 0.9642
PR #889 by anthony-maio: 0.9642
PR #892 by robbiebusinessacc: 1.1428
PR #893 by aryanbhosale: 0.1310
PR #914 by mkenney2: 1.1873
PR #915 by anthony-maio: 0.9642
PR #924 by THUQiXuan: 0.0280
PR #926 by NandhuRajRK: 0.8705
PR #927 by Tonyy1977: 1.1696
PR #932 by anthony-maio: 1.1580
PR #953 by dexhunter: 1.0722
PR #964 by vivekvar-dl: 1.3900
PR #967 by dexhunter: 1.0450
PR #974 by anthony-maio: 1.6542
PR #979 by 0xadvait: 1.1387
PR #995 by dexhunter: 1.0362
PR #1005 by OnlyJundong: 1.0853
PR #1026 by danielxmed: 1.0945
PR #1033 by Naazimsnh02: 0.4311
PR #1043 by okezue: 1.1261
PR #1048 by mrdavtan: 1.1724
PR #1051 by tejas-goyal: 1.2826
PR #1057 by Programmerryoki: 1.2201
PR #1062 by yaowubarbara: 1.4508
PR #1069 by manfromnowhere143: 1.1190
PR #1069 by manfromnowhere143: 1.1190
PR #1070 by manfromnowhere143: 1.1190
PR #1077 by malc3om: 1.1130
PR #1084 by AnubhavBharadwaaj: 1.1185
PR #1086 by Omrigotlieb: 1.1349
PR #1087 by Dhenenjay: 1.1407
PR #1094 by michaelwinczuk: 0.4027
PR #1128 by AnubhavBharadwaaj: 1.1154
PR #1150 by sahiee-dev: 1.1151
PR #1202 by VirajDeshwal: 1.1412
PR #1230 by nestamidavaine: 1.1163
PR #1231 by nestamidavaine: 1.1163
PR #1247 by fahmitech: 1.2208
PR #1269 by Jtss-ux: 1.1194
PR #1276 by BiggerDABOSS: 1.1100
PR #1280 by aamodbhatt: 1.1156
PR #1298 by Omrigotlieb: 1.1043
PR #1311 by htrung1105: 1.1303
PR #1389 by Rome-1: 1.7270
PR #1406 by aamodbhatt: 1.0887
PR #1407 by OnlyJundong: 1.0960
PR #1424 by OnlyJundong: 1.0858
PR #1444 by hypnoastic: 1.3081
PR #1573 by shivangbaveja: 1.1464
PR #1574 by KRGulaj: 1.3587
PR #1579 by Tonyy1977: 1.1372
PR #1582 by He-Wenhao: 1.3428
PR #1630 by KevinChunye: 1.1412
PR #1709 by Bananakin1: 1.1470
PR #1748 by elad-simbalista: 1.2098

Hyperparameters Across PRs

pr_number  bits  scope
64         6     mlp, attn, tok_emb
175        6     all
344        -     per-row weights
379        6     all weights
414        6     MLP and attention weights
429        -     all
445        6     all
456        -     75% of layers
473        6     model weights
478        6     all large weights
518        6     all
531        -     attention layers int6, MLP int5, rest int8 or pass-through
534        -     all
549        6     all
584        -     model weights
617        -     all
625        6     -
642        6     all
645        -     -
657        6     all
659        6     all
668        6     all weights including embeddings
682        6     large 2-D tensors / model weights
691        6     all
710        6     MLP and attention weights
714        6     all
720        6     model weights
726        6     model weights
733        6     all
745        6     all
754        6     per-row weights
758        6     all
762        6     all
763        6     all
768        6     model weights
770        -     all
771        6     all
784        8     all
786        6     model
794        6     model weights
795        6     all
797        6     all
805        6     per-row weights
816        6     all
827        6     all weights
838        6     all weights with FP16 embedding passthrough
857        -     all
865        6     model
870        6     all
882        -     all
884        6     model weights
887        6     all
889        6     model
892        -     block weights
893        6     model weights
914        6     all
915        6     model weights
924        6     base model
926        6     all
927        8     final artifact
932        6     model
953        5     all
964        6     all
967        5     base model
974        6     block weights
979        6     attn/MLP weights
995        5     model
1005       6     all
1026       6     model
1033       6     all
1043       6     all
1048       6     well-conditioned weights
1051       6     model weights
1057       6     all
1062       6     model weights
1069       6     weights
1069       8     weights
1070       6     MLP+attn
1077       6     per-row
1084       6     model
1086       -     per-row
1087       -     all
1094       6     all
1128       6     all
1150       6     all
1202       6     weights
1230       6     export
1231       6     all
1247       6     MLP + attention weights
1269       6     all
1276       6     all
1280       6     all
1298       6     all
1311       6     all large weight matrices
1389       6     all int6 tensors
1406       6     all
1407       6     model + code
1424       6     model weights
1444       6     all
1573       5     MLP and attention weights
1574       6     all weights; embeddings int8
1579       6     all
1582       8     per-row weights
1630       6     all
1709       6     all
1748       8     per-row
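
Many entries above describe "per-row weights" at 4-8 bits. A generic sketch of what that setting usually means is symmetric per-row (per-output-channel) quantization: one floating-point scale per weight-matrix row, chosen so the row's largest magnitude maps to the top of the integer range. This is only an illustration of the common technique, not any specific PR's implementation (GPTQ proper additionally corrects rounding error using second-order weight statistics):

```python
import numpy as np

def quantize_per_row(w, bits=6):
    """Symmetric per-row quantization with one FP32 scale per row.

    A generic sketch of the "per-row weights" setting, not a specific
    submission's code. Values round to [-2^(bits-1), 2^(bits-1) - 1].
    """
    qmax = 2 ** (bits - 1) - 1                   # e.g. 31 for int6
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0                      # guard all-zero rows
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Reconstruct an FP32 approximation of the original weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, s = quantize_per_row(w, bits=6)
err = np.abs(dequantize(q, s) - w).max()         # bounded by scale / 2 per row
```

The per-row scale is what separates this from per-tensor quantization: an outlier in one row inflates only that row's step size, which is why "per-row" variants tend to score better at the same bit width.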