10-layer 4xMLP

Architecture
Used in: 1 PR
Best BPB: 1.4444
Avg BPB: 1.4444

Hyperparameters Across PRs

pr_number | parameters
228 | {"layers":10,"mlp_multiplier":4}
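As a rough illustration of how the parameters listed for PR 228 might be read, the sketch below parses the JSON string and derives a feed-forward hidden width from it. It assumes the common convention that `mlp_multiplier` scales the model width to give the MLP hidden width; the page does not confirm this. The helper `mlp_hidden_width` and the value of `d_model` are hypothetical and chosen only for the example.

```python
import json

# Parameters string exactly as listed for PR 228.
params = json.loads('{"layers":10,"mlp_multiplier":4}')


def mlp_hidden_width(d_model: int, mlp_multiplier: int) -> int:
    """Hypothetical helper. Assumed convention: hidden width = mlp_multiplier * d_model."""
    return mlp_multiplier * d_model


# d_model is an assumption; the page does not state the model width.
d_model = 512
hidden = mlp_hidden_width(d_model, params["mlp_multiplier"])

print(f"layers={params['layers']}, mlp hidden width={hidden} (for assumed d_model={d_model})")
```

Under that assumption, a 4x multiplier means each of the 10 layers has an MLP block four times wider than the model dimension, whatever that dimension is in the actual PR.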