L-BFGS
Optimizer
Used in: 2 PRs
Best BPB: 0.2282
Avg BPB: 0.6164
Hyperparameters Across PRs
| pr_number | weight_decay | momentum | other_params |
|---|---|---|---|
| 1350 | — | — | {"max_iter":25,"history":20,"line_search":"strong_wolfe","space":"logit","warm_start":true,"delta_clamp":5,"focal_loss_last_tokens":128,"causal":true} |
| 1507 | — | — | {"history_size":10,"line_search":"strong Wolfe","steps":6} |
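
To make the `history_size` and `steps` entries in the PR 1507 row concrete, here is a minimal pure-Python sketch of an L-BFGS loop. It is an illustration only, not code from either PR. The names `lbfgs_minimize`, `objective`, and `objective_grad` are hypothetical, and reading `steps` as the number of outer iterations is an assumption. For brevity it uses a backtracking Armijo line search rather than the strong Wolfe search listed in the table.

```python
from collections import deque


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def lbfgs_minimize(f, grad, x0, history_size=10, steps=6):
    """Run `steps` L-BFGS iterations from x0 (hypothetical helper).

    At most `history_size` curvature pairs (s, y) are kept; older pairs
    are dropped automatically by the bounded deque.
    """
    x = list(x0)
    g = grad(x)
    pairs = deque(maxlen=history_size)  # oldest pair first
    for _ in range(steps):
        # Two-loop recursion: build q ~= H * g from the stored pairs.
        q = list(g)
        alphas = []
        for s, y in reversed(pairs):  # newest to oldest
            a = dot(s, q) / dot(y, s)
            alphas.append(a)
            q = [qi - a * yi for qi, yi in zip(q, y)]
        if pairs:
            s, y = pairs[-1]
            gamma = dot(s, y) / dot(y, y)  # initial Hessian scaling
            q = [gamma * qi for qi in q]
        for (s, y), a in zip(pairs, reversed(alphas)):  # oldest to newest
            b = dot(y, q) / dot(y, s)
            q = [qi + (a - b) * si for qi, si in zip(q, s)]
        d = [-qi for qi in q]  # search direction

        # Backtracking Armijo line search (a simplification; the PRs in
        # the table list a strong Wolfe line search instead).
        fx, slope, t = f(x), dot(g, d), 1.0
        while t > 1e-10:
            x_new = [xi + t * di for xi, di in zip(x, d)]
            if f(x_new) <= fx + 1e-4 * t * slope:
                break
            t *= 0.5

        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(s, y) > 1e-12:  # keep only pairs with positive curvature
            pairs.append((s, y))
        x, g = x_new, g_new
    return x


# Toy objective with its minimum at (1, -2), used only for illustration.
def objective(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def objective_grad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


start = [0.0, 0.0]
x_min = lbfgs_minimize(objective, objective_grad, start, history_size=10, steps=6)
```

A larger `history_size` stores more curvature pairs and so costs more memory and work per step, while `steps` caps the total number of updates. The same two knobs appear under the names `history` and `max_iter` in the PR 1350 row.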