
residual scaling

Regularization
Used in: 1 PR
Best BPB: 1.5283
Avg BPB: 1.5283

Hyperparameters Across PRs

pr_number    parameters
54           {"scale":"1/sqrt(K)"}
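The page does not say what K stands for or where the scale is applied. As a minimal sketch, assuming K is the number of residual blocks and that each block's output is multiplied by 1/sqrt(K) before being added back to the residual stream, the recorded setting could look like the following. The names `residual_scale` and `forward` are hypothetical and are not taken from PR 54.

```python
import math


def residual_scale(k: int) -> float:
    """Return 1/sqrt(K).

    Assumption: K is the number of residual blocks; the source only
    records the string "1/sqrt(K)" and does not define K.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    return 1.0 / math.sqrt(k)


def forward(x, blocks):
    """Apply each block as a scaled residual update: x <- x + scale * block(x).

    `blocks` is a list of callables. The scale is computed once from the
    number of blocks (the assumed meaning of K).
    """
    scale = residual_scale(len(blocks))
    for block in blocks:
        x = x + scale * block(x)
    return x
```

With four blocks the scale is 0.5, so each block contributes half of its output to the residual stream. Under this assumption, the aim is to keep the summed contribution of K branches from growing with depth.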