NEFTune

Regularization
Used in: 1 PR
Best BPB: 1.0603
Avg BPB: 1.0603

Hyperparameters Across PRs

pr_number    parameters
2163         {"alpha":5,"training_only":true,"disabled_during_ttt":true}
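The parameters recorded for PR 2163 can be read as a small configuration for noisy-embedding training. The sketch below is one plausible interpretation, not the PR's actual code: it uses the standard NEFTune scaling (uniform noise in [-1, 1] multiplied by alpha / sqrt(seq_len * dim)) and assumes that `training_only` skips the noise at evaluation time and that `disabled_during_ttt` skips it during TTT (presumably test-time training). The function name `neftune_noise` and the `training` / `in_ttt` arguments are hypothetical, and plain Python lists stand in for tensors so the example has no dependencies.

```python
import math
import random

# Values copied from the PR 2163 row; how the PR consumes them is an assumption.
CONFIG = {"alpha": 5, "training_only": True, "disabled_during_ttt": True}


def neftune_noise(embeddings, config, *, training, in_ttt=False, rng=random):
    """Return a (possibly noised) copy of a seq_len x dim list of embeddings.

    Hypothetical helper: the flag semantics are inferred from the flag names.
    """
    # Skip the noise outside training when the config asks for that.
    if config.get("training_only", False) and not training:
        return [row[:] for row in embeddings]
    # Skip the noise during TTT when the config asks for that.
    if config.get("disabled_during_ttt", False) and in_ttt:
        return [row[:] for row in embeddings]
    if not embeddings:
        return []

    seq_len = len(embeddings)
    dim = len(embeddings[0])
    # NEFTune scaling: alpha / sqrt(L * d), applied to Uniform(-1, 1) samples.
    scale = config["alpha"] / math.sqrt(seq_len * dim)
    return [[x + scale * rng.uniform(-1.0, 1.0) for x in row] for row in embeddings]


# Usage: 4 tokens with 8-dimensional zero embeddings, seeded for repeatability.
emb = [[0.0] * 8 for _ in range(4)]
noisy = neftune_noise(emb, CONFIG, training=True, rng=random.Random(0))
clean_eval = neftune_noise(emb, CONFIG, training=False)
clean_ttt = neftune_noise(emb, CONFIG, training=True, in_ttt=True)
bound = CONFIG["alpha"] / math.sqrt(4 * 8)  # largest possible perturbation
```

With these settings every perturbation stays within `bound` in absolute value, and the evaluation and TTT calls return the embeddings unchanged. In a real training loop the same logic would normally live in a forward hook on the embedding layer rather than a standalone function.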