LN scale depth damping
Category: Regularization
Used in: 1 PR
Best BPB: 1.1327
Avg BPB: 1.1327
Submissions
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 489 | {"init_scale_rule":"1/sqrt(layer_idx+1)"} |
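
The `init_scale_rule` value above can be read as a per-layer factor that shrinks with depth. The following is a minimal sketch of that arithmetic only, not the code from PR 489. It assumes `layer_idx` is 0-based and that the factor is used as the initial LayerNorm scale (gain) of each layer; the helper name `ln_init_scale` and the layer count are hypothetical.

```python
import math


def ln_init_scale(layer_idx: int) -> float:
    """Hypothetical helper: depth-damped scale for a 0-based layer index.

    Implements the rule string from the table, 1/sqrt(layer_idx+1).
    """
    if layer_idx < 0:
        raise ValueError("layer_idx must be non-negative")
    return 1.0 / math.sqrt(layer_idx + 1)


# Assumed depth for illustration; the source does not state a layer count.
num_layers = 4
scales = [ln_init_scale(i) for i in range(num_layers)]

for i, s in enumerate(scales):
    print(f"layer {i}: scale = {s:.4f}")
```

Under these assumptions the first layer keeps a scale of 1.0 and the fourth layer starts at 0.5, so deeper layers begin with proportionally smaller normalized outputs.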