loss weighting

Regularization
Used in: 1 PR
Best BPB: 1.1146
Avg BPB: 1.1146

Hyperparameters Across PRs

pr_number | parameters
1519 | {"weight_by":"UTF-8 bytes per token","clamp_min":1}
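
The parameters recorded for PR 1519 suggest that each token's loss is scaled by the number of UTF-8 bytes the token encodes, with the weight floored at 1. The sketch below illustrates that idea in plain Python. It is not taken from the PR. The helper names (`byte_weights`, `weighted_mean_loss`), the example tokens and losses, and the choice to normalise by the total weight are all assumptions made for illustration.

```python
def byte_weights(token_strings, clamp_min=1):
    """Weight for each token: its UTF-8 byte length, floored at clamp_min."""
    return [max(len(tok.encode("utf-8")), clamp_min) for tok in token_strings]


def weighted_mean_loss(per_token_losses, weights):
    """Weighted average of per-token losses.

    Normalising by the sum of the weights is an assumption; the PR may
    normalise differently.
    """
    total_weight = sum(weights)
    return sum(l * w for l, w in zip(per_token_losses, weights)) / total_weight


# Hypothetical decoded tokens; "" stands in for a token that encodes zero bytes.
tokens = ["the", " cat", "é", ""]
# Made-up per-token cross-entropy values.
losses = [2.0, 1.0, 3.0, 0.5]

weights = byte_weights(tokens, clamp_min=1)   # "é" is 2 bytes in UTF-8
loss = weighted_mean_loss(losses, weights)
```

With `clamp_min` set to 1, a token that encodes zero bytes still receives weight 1 instead of dropping out of the loss, which is one plausible reason for the clamp.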