mixed int5/int6/int8
Quantization
Used in: 6 PRs
Best BPB: 1.1172
Avg BPB: 1.1934
Submissions
Hyperparameters Across PRs
| PR number | Bits | Scope |
|---|---|---|
| 272 | — | MLP matrices int5, attention matrices int6, elsewhere int8 |
| 349 | — | MLP weights int5, attention weights int6, embeddings int8/FP16 for small tensors |
| 623 | — | MLP weights (int5), Attention weights (int6), Bigram embeddings (int6), Token embeddings (int8) |
| 678 | — | MLP int5, attention int6, bigram embeddings int6, token embeddings int8 |
| 1090 | — | MLP, attention, embeddings |
| 1422 | — | MLP, attention, embeddings |
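
The per-scope bit widths in the table can be illustrated with a minimal sketch. It assumes the layout of the PR 272 row (MLP int5, attention int6, everything else int8) and uses symmetric per-row quantization, which is an assumption made for illustration and not necessarily what any of the listed PRs implement. The name fragments `mlp` and `attn` and the helpers `bits_for`, `quantize_row`, and `dequantize_row` are hypothetical.

```python
# Hypothetical name fragments; the table does not give the PRs' actual parameter names.
BITS_BY_SCOPE = {"mlp": 5, "attn": 6}
DEFAULT_BITS = 8  # "elsewhere int8", as in the PR 272 row


def bits_for(name):
    """Pick a bit width from a parameter name (hypothetical naming scheme)."""
    for fragment, bits in BITS_BY_SCOPE.items():
        if fragment in name:
            return bits
    return DEFAULT_BITS


def quantize_row(row, bits):
    """Symmetric quantization of one row to signed integers of the given width.

    Assumed scheme for illustration: one scale per row, no zero point.
    """
    qmax = 2 ** (bits - 1) - 1  # 15 for int5, 31 for int6, 127 for int8
    peak = max(abs(x) for x in row)
    scale = peak / qmax if peak > 0 else 1.0  # all-zero rows keep a unit scale
    q = [max(-qmax, min(qmax, round(x / scale))) for x in row]
    return q, scale


def dequantize_row(q, scale):
    """Map quantized integers back to approximate float values."""
    return [v * scale for v in q]
```

Under this scheme an int5 tensor has only 31 representable levels per row against 255 for int8, which is why the table reserves the narrowest width for the MLP matrices and keeps int8 for the token embeddings.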