FP8

Category: Quantization
Used in: 3 PRs
Best BPB: 1.2064
Avg BPB: 1.4151

Hyperparameters Across PRs

pr_number  bits  scope
739        8     all persistent state (master weights, optimizer momentum)
903        8     embeddings and medium matrices
1388       8     FP parameters and scales