MLP
Architecture

| Used in | Best BPB | Avg BPB |
|---|---|---|
| 19 PRs | 0.9650 | 1.1830 |
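
The summary figures can be reproduced from the per-PR scores in the Submissions list. The sketch below is only a cross-check: the dictionary name `bpb_by_pr` is illustrative, the scores are copied from this page, and it assumes the lowest BPB counts as best, which matches the Best BPB figure shown here.

```python
# BPB scores copied from the Submissions list on this page, keyed by PR number.
bpb_by_pr = {
    301: 1.1807, 497: 1.3162, 602: 1.1422, 606: 1.1162, 628: 1.0983,
    636: 1.1234, 641: 1.1239, 653: 1.1552, 873: 1.0467, 939: 1.2519,
    1078: 1.3193, 1246: 0.9650, 1354: 1.1092, 1513: 1.8658, 1538: 1.1180,
    1560: 1.0741, 1625: 1.1104, 1646: 1.0909, 1654: 1.2699,
}

# Assumption: lower BPB is better, so the best submission is the minimum.
best_pr = min(bpb_by_pr, key=bpb_by_pr.get)
best_bpb = bpb_by_pr[best_pr]

# Unweighted mean over all listed submissions.
avg_bpb = sum(bpb_by_pr.values()) / len(bpb_by_pr)

print(len(bpb_by_pr), best_pr, f"{best_bpb:.4f}", f"{avg_bpb:.4f}")
# prints: 19 1246 0.9650 1.1830
```

The mean is dominated by the bulk of submissions between roughly 1.05 and 1.32; the single 1.8658 entry pulls it upward.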
Submissions

| PR | Author | BPB |
|---|---|---|
| #301 | lookin-zz | 1.1807 |
| #497 | THUQiXuan | 1.3162 |
| #602 | ReNothingg | 1.1422 |
| #606 | EthanYangTW | 1.1162 |
| #628 | Christopher-Lee-McClendon | 1.0983 |
| #636 | NewyorkDev | 1.1234 |
| #641 | CiprianFlorin-Ifrim | 1.1239 |
| #653 | demirelo | 1.1552 |
| #873 | gowtham0992 | 1.0467 |
| #939 | brian386 | 1.2519 |
| #1078 | chinmaypatwardhan-ops | 1.3193 |
| #1246 | deborahnelson8788726 | 0.9650 |
| #1354 | samacqua | 1.1092 |
| #1513 | ikermoel | 1.8658 |
| #1538 | davie2009kh | 1.1180 |
| #1560 | dexhunter | 1.0741 |
| #1625 | ChideraIbe123 | 1.1104 |
| #1646 | sergeevii123 | 1.0909 |
| #1654 | IshiPareek | 1.2699 |

Hyperparameters Across PRs

| pr_number | parameters |
|---|---|
| 301 | {"hidden_size":1472} |
| 497 | {"MLP_HIDDEN":992} |
| 602 | {"layers":10} |
| 606 | {"scale":3.5,"activation":"relu²"} |
| 628 | {"expansion_factor":3,"hidden_dim":1536,"activation":"ReLU²"} |
| 636 | {"expansion":3} |
| 641 | {"expansion_factor":4,"hidden_dim":3072,"activation":"relu²"} |
| 653 | {"count":3} |
| 873 | {"blocks":3} |
| 939 | {"expansion":1.875} |
| 1078 | {"rank":16} |
| 1246 | {"expansion":4} |
| 1354 | {"activation":"LeakyReLU","activation_power":2} |
| 1513 | {"parameters":524288} |
| 1538 | {"experts":4,"top_k":2} |
| 1560 | — |
| 1625 | {"blocks":"L5-L10"} |
| 1646 | {"multiplier":4.8} |
| 1654 | {"layers":2} |
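
Several rows above describe the MLP by an expansion factor, a hidden width, and a squared activation ("relu²", or LeakyReLU with `activation_power` 2). The sketch below shows how those quantities could relate. It is not taken from any of the listed PRs: the function names are hypothetical, the LeakyReLU slope of 0.01 is an assumed default, and the width arithmetic assumes an ungated, bias-free two-matrix MLP with hidden width equal to expansion factor times model width.

```python
def relu_squared(x: float) -> float:
    """ReLU followed by squaring: max(x, 0) ** 2."""
    return max(x, 0.0) ** 2


def leaky_relu_power(x: float, negative_slope: float = 0.01, power: int = 2) -> float:
    """Leaky ReLU raised to a power. The slope value is an assumption."""
    y = x if x >= 0 else negative_slope * x
    return y ** power


def implied_model_dim(hidden_dim: int, expansion_factor: float) -> float:
    """Model width implied by hidden_dim = expansion_factor * model_dim (assumed)."""
    return hidden_dim / expansion_factor


def mlp_weight_count(model_dim: int, hidden_dim: int) -> int:
    """Weights in an ungated, bias-free MLP: one up- and one down-projection."""
    return 2 * model_dim * hidden_dim


# Rows that list both an expansion factor and a hidden width (PR #628, PR #641).
for pr, hidden, factor in [(628, 1536, 3), (641, 3072, 4)]:
    d = int(implied_model_dim(hidden, factor))
    print(pr, d, mlp_weight_count(d, hidden))
```

Under these assumptions, PR #628's values imply a model width of 512 and PR #641's a width of 768. Rows that list only one of the two quantities do not determine the other.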