MoE
Architecture
Used in: 6 PRs
Best BPB: 0.8335
Avg BPB: 1.1266
Submissions
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 480 | {"experts":2,"expert_multiplier":1.5} |
| 981 | {"moe_layers":0,"total_layers":2} |
| 1451 | — |
| 1901 | {"shared_experts":1,"specialized_experts":3,"top_k":1} |
| 2102 | {"layers":[4,5],"experts":2,"routing":"top-1","enable_at":0.3} |
| 2107 | — |
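
The parameter cells in the table above are JSON objects whose key names differ from PR to PR. As a minimal sketch, the snippet below parses them and lists which PRs report each key. The row strings are copied from the table; the names `ROWS`, `parse_rows`, and `keys_by_pr` are hypothetical helpers introduced only for this illustration, and rows with no recorded parameters are treated as empty.

```python
import json

# Rows copied from the table above; None marks PRs with no recorded parameters.
ROWS = {
    480: '{"experts":2,"expert_multiplier":1.5}',
    981: '{"moe_layers":0,"total_layers":2}',
    1451: None,
    1901: '{"shared_experts":1,"specialized_experts":3,"top_k":1}',
    2102: '{"layers":[4,5],"experts":2,"routing":"top-1","enable_at":0.3}',
    2107: None,
}


def parse_rows(rows):
    """Decode each JSON string; use an empty dict where nothing was recorded."""
    return {
        pr: json.loads(raw) if raw is not None else {}
        for pr, raw in rows.items()
    }


def keys_by_pr(parsed):
    """Map each hyperparameter name to the sorted list of PRs that report it."""
    index = {}
    for pr, params in parsed.items():
        for key in params:
            index.setdefault(key, []).append(pr)
    return {key: sorted(prs) for key, prs in index.items()}


parsed = parse_rows(ROWS)
index = keys_by_pr(parsed)

for key in sorted(index):
    print(f"{key}: PRs {index[key]}")
```

In this table only `experts` is reported by more than one PR (480 and 2102), so the rows cannot be compared key by key without first agreeing on a common naming scheme.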