int5/int6

Quantization
Used in: 3 PRs
Best BPB: 1.1417
Avg BPB: 1.1659

Hyperparameters Across PRs

pr_number | bits | scope
511       |      |
515       |      | MLP weights (int5), attention weights (int6, per-row scale)
547       |      | MLP matrices (int5), attention matrices (int6), embeddings (int6)
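
The scope entries above say which tensors are stored as int5 or int6, and that one PR uses a per-row scale, but they do not spell out the quantization scheme. The following is a minimal sketch, assuming symmetric round-to-nearest quantization with one scale per weight row. The function names `quantize_rows` and `dequantize_rows` are hypothetical and are not taken from any of the PRs; the actual implementations may differ (for example in clipping or rounding).

```python
def quantize_rows(weights, bits):
    """Symmetric per-row quantization of a 2-D list of floats.

    Assumption: signed codes in [-qmax, qmax] with qmax = 2**(bits-1) - 1,
    i.e. 15 for int5 and 31 for int6. One float scale is kept per row.
    """
    qmax = 2 ** (bits - 1) - 1
    codes, scales = [], []
    for row in weights:
        max_abs = max(abs(w) for w in row)
        # An all-zero row gets a dummy scale of 1.0 to avoid dividing by zero.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        # Round to the nearest integer code, then clamp into the signed range.
        codes.append([max(-qmax, min(qmax, round(w / scale))) for w in row])
        scales.append(scale)
    return codes, scales


def dequantize_rows(codes, scales):
    """Reconstruct approximate floats: each code times its row's scale."""
    return [[q * s for q in row] for row, s in zip(codes, scales)]


# Hypothetical toy tensors: int5 for an "MLP" matrix, int6 for an "attention" matrix.
mlp = [[0.5, -1.0, 0.25], [0.0, 0.0, 0.0]]
attn = [[0.03, -0.12, 0.07], [2.0, -0.5, 1.5]]

mlp_codes, mlp_scales = quantize_rows(mlp, bits=5)
attn_codes, attn_scales = quantize_rows(attn, bits=6)
attn_restored = dequantize_rows(attn_codes, attn_scales)
```

Under these assumptions, the largest-magnitude entry in each row maps to the extreme code (±15 for int5, ±31 for int6), and the reconstruction error of every other entry is at most half of that row's scale.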