int6 (Quantization)

Used in: 164 PRs
Best BPB: 0.0180
Avg BPB: 1.1038
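For reference, BPB here presumably means bits per byte, the standard compression-style language-modeling metric. A minimal sketch of the conversion, assuming the evaluation loss is summed cross-entropy in nats over a byte-counted corpus (illustrative only; the leaderboard's exact evaluation pipeline is not shown here):

```python
import math

def bits_per_byte(total_nats: float, total_bytes: int) -> float:
    """Convert summed cross-entropy (in nats) over a corpus to bits per byte.

    Dividing by ln(2) converts nats to bits; dividing by the byte count
    normalizes so that models with different tokenizers stay comparable.
    """
    return total_nats / math.log(2) / total_bytes

# A corpus of 1000 bytes scored at 800 * ln(2) nats total comes out to 0.8 BPB.
bpb = bits_per_byte(800 * math.log(2), 1000)
```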

Submissions

PR #64 by yesbhautik: 1.1250
PR #88 by seanward: 1.1605
PR #102 by unnir: 1.1618
PR #103 by MatthewHRockwell: 1.5000
PR #110 by mr-ashish-panday: 1.2244
PR #114 by saml212: 1.1574
PR #117 by trovatochris: 1.1702
PR #128 by rsavitt: 1.1594
PR #147 by ankitmaloo: 1.1631
PR #156 by dexhunter: 1.1602
PR #162 by raahilshah (RECORD): 1.1458
PR #173 by tamoghnokandar: 1.1532
PR #178 by timowhite88: 1.1667
PR #179 by devin-cog: 1.1472
PR #182 by mihir-s-05: 1.1844
PR #186 by mahsumaktas: 1.1565
PR #191 by chris-buckley: 1.1598
PR #201 by machdragon: 1.1551
PR #204 by Akasxh: 1.2320
PR #208 by ajkpersonal: 1.1568
PR #209 by JWLBOYCE: 1.1624
PR #212 by mrdavtan: 1.1329
PR #215 by JayCheng113: 1.1548
PR #217 by kshitizz36: 1.1753
PR #218 by bopmite: 1.1248
PR #230 by MatthewHRockwell: 1.1541
PR #238 by kellyvv: 1.5164
PR #243 by kvmukilan: 1.1704
PR #246 by kvmukilan: 1.1704
PR #249 by kvmukilan: 1.1704
PR #251 by kshitizz36: 1.1596
PR #262 by ibarrajo: 1.0539
PR #275 by ibarrajo: 1.0539
PR #278 by nicolasdickenmann: 1.0365
PR #289 by integrate-your-mind: 1.1518
PR #290 by ibarrajo: 1.1354
PR #294 by sseanliu: 1.1645
PR #296 by sseanliu: 1.1645
PR #303 by sseanliu: 1.1436
PR #307 by dennisimoo: 1.1357
PR #316 by SkywardSyntax: 1.2035
PR #330 by bopmite: 1.1609
PR #333 by mahsumaktas: 1.1565
PR #344 by aryanbhosale: 1.1330
PR #362 by mkenney2: 1.1497
PR #371 by mrdavtan: 1.1401
PR #373 by JoeProAI: 1.1634
PR #375 by charmquark1984: 1.1257
PR #384 by anantdgoel: 1.2882
PR #394 by greqone: 1.1247
PR #399 by abaybektursun: 1.1247
PR #400 by chanwoo-park-official: 1.1296
PR #416 by kshitizz36: 1.1230
PR #418 by yashverms: 1.1715
PR #424 by someone114514: 1.1725
PR #429 by AbhisekBasu1: 1.1231
PR #432 by jadechip: 1.5295
PR #442 by sjp611: 1.1027
PR #448 by handemanai: 1.2006
PR #452 by ofirkris: 1.1366
PR #462 by JoeProAI: 1.0672
PR #465 by LoquiAuris: 1.1508
PR #481 by mrdavtan: 1.0970
PR #485 by harsha-gouru: 1.1522
PR #493 by parinzee: 1.1309
PR #512 by MatoTeziTanka: 0.9512
PR #517 by lukacf: 0.9789
PR #526 by Christopher-Lee-McClendon: 1.1425
PR #548 by LoquiAuris: 1.0865
PR #567 by nitSubedi: 1.3660
PR #568 by MatoTeziTanka: 0.7853
PR #596 by AriaAnima: 0.6430
PR #599 by mkenney2: 1.1828
PR #605 by bigbag: 0.7227
PR #614 by bigbag: 0.6864
PR #646 by Upsalla: 1.1349
PR #661 by andrewbaggio1: 1.1175
PR #668 by Christopher-Lee-McClendon: 1.0920
PR #671 by keshav55: 1.1807
PR #672 by andrewbaggio1: 1.0781
PR #685 by andrewbaggio1: 1.0366
PR #686 by msisovic: 1.1182
PR #696 by gravelBridge: 1.2622
PR #705 by seanward: 1.2151
PR #715 by Asukabot0: 1.0337
PR #722 by magicjulio: 0.5588
PR #727 by Asukabot0: 0.9674
PR #741 by andrewbaggio1: 0.9850
PR #759 by markste-in: 1.3092
PR #767 by RichiiiTV: 0.9209
PR #769 by MatoTeziTanka: 0.8508
PR #770 by minh-stakc: 0.6672
PR #773 by siddhantparadox: 1.1532
PR #776 by agalimova: 0.9258
PR #782 by newjordan: 0.9362
PR #793 by pall23-mech: 1.2500
PR #798 by travispchen: 0.5466
PR #831 by sseanliu: 1.1284
PR #841 by someone114514: 1.1157
PR #857 by aruniyer: 1.1093
PR #883 by THUQiXuan: 0.0308
PR #886 by abaybektursun: 0.3779
PR #891 by robbiebusinessacc: 1.1428
PR #892 by robbiebusinessacc: 1.1428
PR #901 by Hilo-Hilo: 1.1590
PR #907 by resouer: 0.0960
PR #909 by sunnypatneedi: 0.8609
PR #940 by antaloaalonso: 0.9581
PR #978 by AnirudhRahul: 1.5134
PR #990 by newjordan: 0.7614
PR #997 by randy06122001-boop: 1.4182
PR #998 by asuramaya: 0.5755
PR #1007 by dillon-blake: 1.2252
PR #1014 by haimianbaobao007: 1.6200
PR #1028 by newjordan: 0.8104
PR #1030 by sofiabod: 0.1130
PR #1044 by greqone: 1.8989
PR #1048 by mrdavtan: 1.1724
PR #1055 by sanyalsunny111: 0.9693
PR #1056 by sofiabod: 0.0180
PR #1071 by AbhayAnandUCSD: 1.1455
PR #1081 by michaelwinczuk: 1.1220
PR #1108 by DbBested: 1.1502
PR #1112 by dillon-blake: 1.2252
PR #1140 by newjordan: 1.1874
PR #1174 by Okropniak: 1.3069
PR #1180 by estesryan: 1.0577
PR #1183 by akaiHuang: 1.5080
PR #1185 by skoustav35: 0.9641
PR #1186 by andrewbaggio1: 0.9850
PR #1214 by gersh: 1.1688
PR #1227 by himanshudongre: 1.4841
PR #1232 by Christopher-Lee-McClendon: 1.0929
PR #1242 by Campbellb: 1.0903
PR #1243 by simon-marcus: 1.1230
PR #1244 by monkeyKingProgrammer: 1.1443
PR #1253 by Okropniak: 1.2326
PR #1255 by akaiHuang: 1.5080
PR #1282 by newjordan: 1.1035
PR #1307 by amrayach: 1.1101
PR #1320 by jpfeiffe: 1.1196
PR #1330 by luciobaiocchi: 1.4617
PR #1331 by dexhunter: 1.0900
PR #1349 by LocalX991: 1.3693
PR #1354 by samacqua: 1.1092
PR #1414 by Abhishek8108: 0.7093
PR #1418 by Park-Tae-Hwan: 1.4192
PR #1447 by shram86: 1.1834
PR #1463 by tsubasagit: 1.2774
PR #1473 by AVINASH0052: 1.1156
PR #1476 by aryan-cs: 1.0842
PR #1518 by abaybektursun: 1.0788
PR #1531 by mini-sarami: 1.4537
PR #1534 by someone114514: 1.0846
PR #1602 by SPThole: 1.0744
PR #1612 by seekerPrice: 1.5096
PR #1654 by IshiPareek: 1.2699
PR #1663 by pablinga19: 1.0862
PR #1724 by Unwindology: 1.1803
PR #1732 by Victory963: 1.0785
PR #1733 by G3sparky: 1.3262
PR #1740 by amrayach: 1.0722
PR #1741 by amrayach: 1.0722

Hyperparameters Across PRs

pr_number | bits | scope
64 | 6 | mlp, attn, tok_emb
88 | 6 | all large 2D weight matrices
102 | 6 | MLP and attention weight matrices
103 | 6 | block weights with fp16 embedding and fp16 LoRA passthrough
110 | 6 | large 2D matrices; fp16 for tied embedding
114 | 6 | weight matrices
117 | 6 | per-row weights
128 | 6 | MLP and attention weights; tied embeddings kept fp16
147 | 6 | all
156 | 6 | per-row weights; embeddings kept fp16
162 | 6 | MLP and attention weights; fp16 passthrough for tied embeddings and last-layer key projection
173 | 6 | weight matrices with per-row scaling; tied embedding and last 2 layers' c_k.weight kept in fp16
178 | 6 | all
179 | 6 | MLP and attention weights; embeddings kept in fp16
182 | 6 | middle layers
186 | 6 | per-row weights
191 | 6 | all large weight matrices
201 | 6 | MLP and attention weights; int8 embeddings
204 | 6 | all model weights
208 | 6 | artifact/model weights
209 | 6 | weight bits for model weights; embeddings kept at 16 bits
212 | 6 | all weights
215 | 6 | MLP and attention weights
217 | 6 | all
218 | 6 | all
230 | 6 | per-row weights; tied embeddings kept in fp16
238 | 6 | all
243 | 6 | all
246 | 6 | all
249 | 6 | all
251 | 6 | all except fp16 embeddings
262 | 6 | all
275 | 6 | model weights
278 | 6 | model weights
289 | 6 | MLP and attention weights
290 | 6 | all
294 | 6 | model weights
296 | 6 | all
303 | 6 | all
307 | 6 | all
316 | 6 | all
330 | 6 | all weights per-row
333 | 6 | per-row weights
344 | 6 | per-row weights
362 | 6 | all
371 | 6 | all
373 | 6 | all
375 | 6 | all
384 | 6 | all
394 | 6 | model artifact
399 | 6 | evaluation artifact / model weights
400 | 6 | mlp, attn
416 | 6 | all
418 | 6 | MLP and attention weight matrices
424 | 6 | baseline model weights
429 | 6 | all
432 | 6 | MLP-only export / model weights with targeted fp16 exceptions
442 | 6 | mixed
448 | 6 | all weights with fp16 embedding passthrough
452 | 6 | attention
462 | 6 | all
465 | 6 | attention
465 | 6 | embeddings
481 | 6 | per-row all weights
485 | 6 | attention weights
493 | 6 | all large weight matrices
512 | 6 | all weight matrices
517 | 6 | all
526 | 6 | all
548 | 6 | MLP and attention weights
567 | 6 |
568 | 6 | all weight matrices
596 | 6 | all
599 | 6 | all
605 | 6 | all weights with FP16 passthrough for embeddings and control tensors
614 | 6 | all
646 | 6 |
661 | 6 | all
668 | 6 | per-row, including embeddings
671 | 6 | attention weights
672 | 6 | model weights
685 | 6 | all
686 | 6 | all
696 | 6 | all weights
705 | 6 | all
715 | 6 | all
722 | 6 | all
727 | 6 | per-row weights
741 | 6 | all
759 | 6 | MLP
767 | 6 | all
769 | 6 | all
770 | 6 | per-row
773 | 6 | model weights
776 | 6 | all
782 | 6 | model weights
793 | 6 | all
798 | 6 | all
831 | 6 | per-row weights
841 | 6 | final artifact export
857 | 6 | all
883 | 6 | final artifact
886 | 6 | all
891 | 6 | MLP weights
892 | 6 | MLP weights
901 | 6 | model
907 | 6 | all
909 | 6 | all
940 | 6 | per-row
978 | 6 | all
990 | 6 | all
997 | 6 | block weights
998 | 6 | artifact
1007 | 6 | all
1014 | 6 | all
1028 | 6 | model weights
1030 | 6 | per-row
1044 | 6 | all
1048 | 6 | all weights
1055 | 6 | per-row weights
1056 | 6 | per-row
1071 | 6 | per-row weights
1081 | 6 | all
1108 | 6 | all
1112 | 6 | all
1140 | 6 | final artifact
1174 | 6 | all
1180 | 6 | all
1183 | 6 | all
1185 | 6 | per-row
1186 | 6 | all
1214 | 6 | artifact weights
1227 | 6 | all
1232 | 6 | all
1242 | 6 | all
1243 | 6 | attn, mlp, embed, other floating tensors
1244 | 6 | all
1253 | 6 | all
1255 | 6 | all
1282 | 6 | naive
1307 | 6 | per-row export
1320 | 6 | per-row
1330 | 6 | all
1331 | 6 | all
1349 | 6 | all
1354 | 6 | model
1414 | 6 | all
1418 | 6 | model weights
1447 | 6 | AWQ
1463 | 6 | weights
1473 | 6 | all
1476 | 6 | artifact
1518 | 6 | model
1531 | 6 | all
1534 | 6 | all
1602 | 6 | MLP
1612 | 6 | model artifact
1654 | 6 | all
1663 | 6 | sliding eval artifact
1724 | 6 | all
1732 | 6 | MLP FC1
1733 | 6 | attention
1740 | 6 | all
1741 | 6 | model
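Many scopes above read "per-row weights", often with an fp16 passthrough for embeddings. As context for what that typically means, here is a minimal sketch of symmetric per-row 6-bit quantization of a 2D weight matrix (illustrative only; no particular PR's implementation is shown here, and the per-row scale/clip choices are assumptions):

```python
import numpy as np

def quantize_int6_per_row(w: np.ndarray):
    """Symmetric per-row quantization of a 2D weight matrix to 6-bit integers.

    Each row gets its own scale, so a single large-magnitude row doesn't
    force a coarse grid on every other row. Signed int6 values span [-31, 31].
    """
    qmax = 2 ** (6 - 1) - 1                       # 31
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)      # guard all-zero rows
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Reconstruct approximate fp32 weights from int6 codes and row scales."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, s = quantize_int6_per_row(w)
w_hat = dequantize(q, s)   # round-trip error is at most half a row's scale
```

An "fp16 passthrough" scope simply exempts certain tensors (typically the tied token embedding) from this step, storing them at 16-bit precision instead.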