NeoMuon
Optimizer
Used in: 2 PRs
Best BPB: 1.1239
Avg BPB: 1.1404
Hyperparameters Across PRs
| pr_number | weight_decay | momentum | other_params |
|---|---|---|---|
| 640 | 0 | 0.95 | {"backend_steps":3,"momentum_warmup_start":0.85,"momentum_warmup_steps":500,"adam_lr":0.05,"adam_wd":0.05,"matrix_lr":0.04,"scalar_lr":0.02,"tied_embed_lr":0.02} |
| 641 | 0 | 0.95 | {"muon_backend_steps":3,"muon_momentum_warmup_start":0.85,"muon_momentum_warmup_steps":500} |
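
Both PRs list a momentum warmup (start 0.85, 500 steps) alongside a final momentum of 0.95. The page does not say how the ramp is shaped, so the following is only a minimal sketch that assumes linear interpolation between the two values. The helper name `momentum_at` and its parameter names are hypothetical and do not come from either PR.

```python
def momentum_at(step, start=0.85, end=0.95, warmup_steps=500):
    """Return the momentum to use at a given training step.

    Assumption: momentum ramps linearly from `start` to `end` over
    `warmup_steps` steps and stays at `end` afterwards. The defaults mirror
    the table values; the linear shape is a guess, not a documented fact.
    """
    if warmup_steps <= 0:
        # No warmup configured: use the final momentum immediately.
        return end
    # Fraction of the warmup completed, clamped to [0, 1].
    frac = min(max(step / warmup_steps, 0.0), 1.0)
    return (1.0 - frac) * start + frac * end


if __name__ == "__main__":
    # Print the schedule at a few steps around the warmup window.
    for step in (0, 100, 250, 500, 1000):
        print(step, round(momentum_at(step), 4))
```

In a training loop, a value like this would typically be written into the optimizer's momentum setting once per step, before the update is applied.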