MLP expansion
Architecture
Used in: 5 PRs
Best BPB: 1.1355
Avg BPB: 1.1880
Submissions
Hyperparameters Across PRs
| PR number | Parameters |
|---|---|
| 579 | {"hidden_dim":2560,"activation":"relu-squared"} |
| 592 | {"expansion_factor":3,"hidden_dim":1536} |
| 966 | {"baseline":"2.00x","short_conv":"1.99x","moc":"1.93x"} |
| 1315 | {"scale_vs_baseline":2.65} |
| 1551 | {"baseline_expand":"2x","temporary_expand":"4x","effective_training_width":"8x"} |
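
To make the `expansion_factor` and `hidden_dim` entries above concrete, here is a minimal pure-Python sketch of how an expansion factor maps to a hidden width, and of an MLP that uses the `relu-squared` activation named for PR 579. It is illustrative only and is not code from any of the listed PRs. The function names are hypothetical, the model width of 512 is inferred from PR 592's row (1536 / 3), and the bias-free two-matrix layout is an assumption.

```python
def relu_squared(x: float) -> float:
    """ReLU followed by squaring: max(x, 0) ** 2."""
    return max(x, 0.0) ** 2


def hidden_dim(model_dim: int, expansion_factor: float) -> int:
    """Hidden-layer width for a given expansion factor (hypothetical helper)."""
    return int(round(model_dim * expansion_factor))


def mlp_param_count(model_dim: int, hidden: int) -> int:
    """Weight count for an assumed bias-free MLP: up- plus down-projection."""
    return 2 * model_dim * hidden


def mlp_forward(x, w_up, w_down):
    """Apply the up-projection, relu-squared, then the down-projection.

    x:      list of model_dim floats
    w_up:   hidden x model_dim weights (list of rows)
    w_down: model_dim x hidden weights (list of rows)
    """
    h = [relu_squared(sum(w * xi for w, xi in zip(row, x))) for row in w_up]
    return [sum(w * hi for w, hi in zip(row, h)) for row in w_down]


if __name__ == "__main__":
    # Assumed model width of 512, inferred from PR 592 (hidden_dim / expansion_factor).
    width = hidden_dim(512, 3)
    print(width, mlp_param_count(512, width))
```

Under these assumptions the MLP weight count grows linearly with the expansion factor, so widening the hidden layer trades parameter budget directly against whatever the wider layer gains in BPB.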