SwiGLU MLP
Architecture
Used in: 5 PRs
Best BPB: 1.1558
Avg BPB: 1.2220
Submissions
Hyperparameters Across PRs
| PR number | Parameters |
|---|---|
| 131 | {"hidden":1024} |
| 163 | {"layers":7,"dim":576,"mlp_mult":2} |
| 391 | {"hidden_size":1280} |
| 395 | {"hidden_size":1280} |
| 507 | {"expansion_factor":3} |
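For orientation, here is a minimal sketch of a SwiGLU MLP block in plain Python, using the standard formulation `down(silu(gate(x)) * up(x))`. It is not taken from any of the PRs above. The class name `SwiGLUMLP`, the arguments `dim` and `hidden`, the bias-free projections, and the initialization scale are all illustrative assumptions. The table's `hidden` and `hidden_size` values presumably correspond to the `hidden` width used here, but the page does not confirm that.

```python
import math
import random


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def init_matrix(rows, cols, rng, scale=0.02):
    """Small random Gaussian weights; the scale is an arbitrary choice."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class SwiGLUMLP:
    """Hypothetical bias-free SwiGLU feed-forward block (illustration only)."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        self.w_gate = init_matrix(hidden, dim, rng)  # dim -> hidden
        self.w_up = init_matrix(hidden, dim, rng)    # dim -> hidden
        self.w_down = init_matrix(dim, hidden, rng)  # hidden -> dim

    def __call__(self, x):
        # Gate branch goes through SiLU, then multiplies the up branch elementwise.
        gate = [silu(g) for g in matvec(self.w_gate, x)]
        up = matvec(self.w_up, x)
        return matvec(self.w_down, [g * u for g, u in zip(gate, up)])

    def num_params(self):
        # Three weight matrices of dim * hidden entries each.
        return sum(len(m) * len(m[0]) for m in (self.w_gate, self.w_up, self.w_down))


mlp = SwiGLUMLP(dim=8, hidden=16)
y = mlp([0.1] * 8)
print(len(y), mlp.num_params())
```

Because the block holds three projections rather than two, its weight count under these assumptions is `3 * dim * hidden`. That is one reason a SwiGLU hidden width is often set smaller than that of a two-matrix MLP at the same parameter budget.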