full-run Int6 QAT with STE

Quantization
Used in: 1 PR
Best BPB: 1.1489
Avg BPB: 1.1489

Hyperparameters Across PRs

pr_number  bits  scope
583        6     all except MLP and embeddings
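
For readers unfamiliar with the technique named in the title, the following is a minimal pure-Python sketch of 6-bit fake quantization trained with a straight-through estimator (STE). It is not taken from the PR. The symmetric per-tensor scale, the integer range of -31..31, the toy squared-error loss, and the function names `fake_quant_int6` and `ste_step` are all assumptions made for illustration. The sketch also does not model the scope column (which layers are quantized) or the "full-run" schedule, which presumably means quantization stays active for the whole training run.

```python
def fake_quant_int6(weights):
    """Fake-quantize a list of floats to 6 bits (assumed symmetric, per-tensor)."""
    qmax = 2 ** (6 - 1) - 1  # 31; assumed integer range is -31..31
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    levels = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    # Dequantize so the forward pass sees values on the 6-bit grid.
    dequantized = [q * scale for q in levels]
    return dequantized, levels, scale


def ste_step(weights, x, target, lr=0.1):
    """One SGD step on the toy loss (dot(q(w), x) - target)^2 using the STE."""
    w_q, _, _ = fake_quant_int6(weights)
    pred = sum(wq * xi for wq, xi in zip(w_q, x))
    err = pred - target
    # Gradient of the loss with respect to the quantized weights.
    grad_wq = [2.0 * err * xi for xi in x]
    # STE: treat d q(w) / d w as 1, so the same gradient updates the
    # full-precision (latent) weights even though rounding has zero gradient.
    new_weights = [w - lr * g for w, g in zip(weights, grad_wq)]
    return new_weights, err * err


if __name__ == "__main__":
    w = [0.5, -0.25, 1.0]
    for step in range(3):
        w, loss = ste_step(w, x=[1.0, 0.0, 0.0], target=0.0)
        print(step, loss)
```

The latent weights stay in full precision throughout; only the forward pass sees the quantized values, which is what lets the gradient update accumulate small changes that rounding alone would discard.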