mixed int5/int6/int7

Quantization
Used in: 5 PRs
Best BPB: 1.0631
Avg BPB: 1.1030

Hyperparameters Across PRs

pr_number  bits  scope
3325             all weights with gradient-guided per-tensor allocation
422              all
1126             weights
1962             matrix weights
2143             q/proj/mlp_proj, kv/mlp_fc, tok_emb
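
The scope column mentions per-tensor bit allocation but does not show how any PR implements it. As a rough illustration only, the sketch below pairs symmetric per-tensor quantization at 5, 6, or 7 bits with a simple ranking rule that gives more bits to tensors with a higher sensitivity score. The function names (`quantize_symmetric`, `dequantize`, `allocate_bits`), the sensitivity scores, and the "split the ranking into equal groups" rule are all assumptions made for this example, not the method used by the listed PRs.

```python
def quantize_symmetric(values, bits):
    """Symmetric per-tensor quantization of a flat list of floats.

    Returns integer codes in [-qmax, qmax] and the scale used,
    where qmax is 15 for int5, 31 for int6, and 63 for int7.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


def allocate_bits(sensitivity, choices=(5, 6, 7)):
    """Hypothetical allocation rule (an assumption, not taken from any PR).

    Ranks tensors by a sensitivity score (for example, a gradient
    magnitude) and splits the ranking into equal groups, so the least
    sensitive tensors get the fewest bits.
    """
    ranked = sorted(sensitivity, key=sensitivity.get)
    n = len(ranked)
    return {
        name: choices[min(i * len(choices) // n, len(choices) - 1)]
        for i, name in enumerate(ranked)
    }


# Illustrative usage with made-up tensor names and scores.
scores = {"tok_emb": 0.9, "mlp_fc": 0.1, "q_proj": 0.5}
bit_plan = allocate_bits(scores)
codes, scale = quantize_symmetric([1.0, -0.3, 0.0], bit_plan["mlp_fc"])
restored = dequantize(codes, scale)
```

With this rule, the rounding error of each restored value is bounded by half the scale, so a tensor assigned 7 bits has a scale roughly four times finer than one assigned 5 bits.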