MLP4x
Architecture
Used in: 15 PRs
Best BPB: 1.0764
Avg BPB: 1.1166
Submissions
| PR | Author | BPB | Note |
|---|---|---|---|
| #498 | newjordan | 1.1478 | |
| #842 | JUSTSUJAY | 1.3380 | |
| #1052 | demouo | 1.1978 | |
| #1260 | dexhunter | 1.0929 | |
| #1287 | dentity007 | 1.1048 | |
| #1291 | dentity007 | 1.0925 | |
| #1296 | aryanbhosale | 1.0926 | |
| #1326 | aryanbhosale | 1.0896 | |
| #1334 | aryanbhosale | 1.0897 | RECORD |
| #1392 | Its-Just-Crump | 1.1020 | |
| #1423 | aryanbhosale | 1.0791 | |
| #1477 | aryanbhosale | 1.0822 | RECORD |
| #1555 | andrewbaggio1 | 1.0764 | |
| #1658 | AVINASH0052 | 1.0810 | |
| #1747 | swapp1990 | 1.0820 | |

Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 498 | {"multiplier":4,"hidden_size":2560} |
| 842 | {"layers":5,"model_dim":512,"mlp_mult":4,"hidden":2048,"num_heads":8,"num_kv_heads":4} |
| 1052 | — |
| 1260 | {"multiplier":4} |
| 1287 | {"multiplier":4} |
| 1291 | {"multiplier":4} |
| 1296 | — |
| 1326 | — |
| 1334 | {"vocab_size":4096} |
| 1392 | {"multiplier":4} |
| 1423 | {"multiplier":4} |
| 1477 | — |
| 1555 | {"multiplier":4} |
| 1658 | {"multiplier":4,"hidden_dim":2048} |
| 1747 | {"layers":11,"intermediate_dim":2048} |
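
The parameter blobs above use different key names for what look like the same quantities (`multiplier` vs. `mlp_mult`; `hidden_size`, `hidden`, `hidden_dim`, `intermediate_dim`). The sketch below shows one way to fold them into a common shape for comparison. The alias groupings and the helper names (`normalize`, `consistent`) are assumptions made for illustration, inferred from the key names alone; the PRs themselves do not confirm that these keys are equivalent.

```python
import json

# Rows copied from the hyperparameter table; rows with no reported
# parameters are stored as None.
ROWS = {
    498: '{"multiplier":4,"hidden_size":2560}',
    842: '{"layers":5,"model_dim":512,"mlp_mult":4,"hidden":2048,'
         '"num_heads":8,"num_kv_heads":4}',
    1052: None,
    1260: '{"multiplier":4}',
    1287: '{"multiplier":4}',
    1291: '{"multiplier":4}',
    1296: None,
    1326: None,
    1334: '{"vocab_size":4096}',
    1392: '{"multiplier":4}',
    1423: '{"multiplier":4}',
    1477: None,
    1555: '{"multiplier":4}',
    1658: '{"multiplier":4,"hidden_dim":2048}',
    1747: '{"layers":11,"intermediate_dim":2048}',
}

# ASSUMPTION: these keys are treated as aliases based on their names only.
MULT_KEYS = ("multiplier", "mlp_mult")
HIDDEN_KEYS = ("hidden_size", "hidden", "hidden_dim", "intermediate_dim")


def first_present(params, keys):
    """Return the value of the first key found in params, else None."""
    for key in keys:
        if key in params:
            return params[key]
    return None


def normalize(raw):
    """Map one raw JSON blob (or None) onto a common three-field record."""
    params = json.loads(raw) if raw else {}
    return {
        "mlp_mult": first_present(params, MULT_KEYS),
        "mlp_hidden": first_present(params, HIDDEN_KEYS),
        "model_dim": params.get("model_dim"),
    }


def consistent(row):
    """Check model_dim * mlp_mult == mlp_hidden; None if a field is missing."""
    if None in row.values():
        return None
    return row["model_dim"] * row["mlp_mult"] == row["mlp_hidden"]


normalized = {pr: normalize(raw) for pr, raw in ROWS.items()}

for pr, row in normalized.items():
    print(pr, row, consistent(row))
```

Only PR 842 reports all three fields, so it is the only row where the 4x relationship can be checked directly from the table (512 × 4 = 2048). For every other row the check returns `None` rather than guessing a missing dimension.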