EMA weights, LN Scale

Regularization
Used in: 1 PR
Best BPB: 1.1428
Avg BPB: 1.1428

Hyperparameters Across PRs

pr_number    parameters
516          {"ln_scale":true}