Full GPTQ

Quantization
Used in: 2 PRs
Best BPB: 1.1175
Avg BPB: 1.1189

Hyperparameters Across PRs

pr_number  bits  scope
535        6     all weights except small tensors and tok_emb.weight (fp16)
569        6     all large weights (MLP, attention, bigram, VE projections); int8 for embeddings
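
To make the first scope entry above concrete (all weights except small tensors and tok_emb.weight, which stays in fp16), here is a minimal sketch of how such a per-tensor plan might be built. It is not taken from either PR: the function name plan_precision, the SMALL_TENSOR_NUMEL cutoff, and every tensor name other than tok_emb.weight are hypothetical, and the 6-bit default simply follows the bits value the table appears to list. The sketch only decides which format each tensor gets; it does not perform the GPTQ quantization itself.

```python
# Hypothetical sketch: choose a storage format per tensor, following the
# "all weights except small tensors and tok_emb.weight (fp16)" scope.
# The cutoff and the example tensor names are illustrative assumptions.

SMALL_TENSOR_NUMEL = 65_536  # assumed cutoff for what counts as a "small tensor"


def plan_precision(shapes, bits=6, keep_fp16=("tok_emb.weight",)):
    """Map each tensor name to 'fp16' or 'int<bits>'."""
    plan = {}
    for name, shape in shapes.items():
        # Count the elements in the tensor from its shape.
        numel = 1
        for dim in shape:
            numel *= dim
        # Explicitly excluded tensors and small tensors stay in fp16.
        if name in keep_fp16 or numel < SMALL_TENSOR_NUMEL:
            plan[name] = "fp16"
        else:
            plan[name] = f"int{bits}"
    return plan


# Example shapes are made up for illustration only.
example_shapes = {
    "tok_emb.weight": (1024, 512),
    "blocks.0.attn.qkv.weight": (1536, 512),
    "blocks.0.mlp.fc.weight": (2048, 512),
    "blocks.0.norm.weight": (512,),
}
plan = plan_precision(example_shapes)
```

A variant for the second scope entry would pass an empty keep_fp16 and map embedding tensors to int8 instead; how each PR actually identifies "large" weights is not stated here.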