MLP expansion adjustment

Architecture

Used in: 1 PR
Best BPB: 1.2164
Avg BPB: 1.2164

Hyperparameters Across PRs

pr_number  parameters
679        {"baseline_mlp_mult": 2, "bankedlinear_mlp_mult": 2.6}
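To make the two multipliers concrete, here is a minimal sketch of how an MLP expansion multiplier is commonly turned into a hidden width and a weight count. It assumes the multiplier scales the MLP hidden dimension relative to the model dimension and that the MLP is a plain two-matrix block (up-projection and down-projection, no biases). Neither assumption is confirmed by PR 679, and MODEL_DIM = 512 and both function names are hypothetical, chosen only for illustration.

```python
def mlp_hidden_dim(model_dim: int, mlp_mult: float) -> int:
    """Hidden width of the MLP, assuming hidden = model_dim * mlp_mult (rounded)."""
    return int(round(model_dim * mlp_mult))


def mlp_weight_count(model_dim: int, mlp_mult: float) -> int:
    """Weights in a two-matrix MLP (up- and down-projection), biases ignored."""
    hidden = mlp_hidden_dim(model_dim, mlp_mult)
    return 2 * model_dim * hidden


if __name__ == "__main__":
    MODEL_DIM = 512  # hypothetical; the source does not state the model width

    # Multiplier values taken from the PR 679 row above.
    params = {"baseline_mlp_mult": 2, "bankedlinear_mlp_mult": 2.6}

    for name, mult in params.items():
        hidden = mlp_hidden_dim(MODEL_DIM, mult)
        weights = mlp_weight_count(MODEL_DIM, mult)
        print(f"{name}={mult}: hidden={hidden}, mlp_weights_per_layer={weights}")
```

Under these assumptions, moving from 2 to 2.6 increases the per-layer MLP weight count by roughly 30%. How the banked-linear variant actually stores or shares those weights is not described here, so the sketch says nothing about the final model size.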