
mixed int5/int6 QAT

Quantization
Used in: 5 PRs
Best BPB: 1.1466
Avg BPB: 1.1779
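BPB here presumably denotes bits per byte, the compression-style language-modeling metric. As a sketch of how such a number is commonly derived (assuming the training loss is cross-entropy in nats per token; the function name and arguments are illustrative, not taken from the PRs):

```python
import math

def bits_per_byte(xent_nats_per_token: float, num_tokens: int, num_bytes: int) -> float:
    """Convert token-level cross-entropy (nats/token) to bits per byte of raw text."""
    bits_per_token = xent_nats_per_token / math.log(2)  # nats -> bits
    return bits_per_token * (num_tokens / num_bytes)    # normalize by raw byte count

# Example: a loss of ln(2) nats/token on text with one token per byte is exactly 1 BPB.
print(bits_per_byte(math.log(2), num_tokens=1000, num_bytes=1000))
```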

Hyperparameters Across PRs

| pr_number | bits | scope |
| --- | --- | --- |
| 351 | | MLP weights and attention weights |
| 352 | | MLP weights int5, attention weights int6, embeddings fp16 |
| 421 | | MLP int5, attention int6, embeddings int8 |
| 694 | 5 | MLP and attention |
| 822 | | MLP in int5, attention in int6 |
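The recurring configuration above (MLP weights in int5, attention weights in int6) can be sketched as symmetric per-tensor fake quantization, the forward-pass operation at the heart of QAT. This is a minimal illustrative sketch, not the implementation behind these PRs; the layer-name matching (`"mlp"` vs anything else) is a hypothetical convention:

```python
import numpy as np

def fake_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric per-tensor fake quantization: round to a signed int grid, then dequantize."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 15 for int5, 31 for int6
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / qmax if max_abs > 0 else 1.0  # per-tensor scale
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                                # dequantized weights used in the forward pass

def quantize_weights(weights: dict) -> dict:
    """Mixed-precision assignment mirroring the table: int5 for MLP, int6 for attention."""
    return {
        name: fake_quantize(w, bits=5 if "mlp" in name else 6)
        for name, w in weights.items()
    }

w = np.linspace(-1.0, 1.0, 101)
out = quantize_weights({"mlp_fc": w, "attn_qkv": w})
```

In full QAT these dequantized weights replace the float weights in the forward pass, while gradients flow to the float weights via a straight-through estimator.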