BackoffNgramMixer
Architecture
Used in: 2 PRs
Best BPB: 0.0308
Avg BPB: 0.3490
Hyperparameters Across PRs
| PR number | Parameters |
|---|---|
| 813 | {"orders":"2-7"} |
| 883 | {"max_order":13,"experts":13} |
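
The two PRs record their settings under different keys, so comparing them programmatically needs a small normalization step. The following is a minimal sketch, not the project's own tooling: `ROWS` and `parse_params` are hypothetical names, and reading `"2-7"` as the inclusive range of n-gram orders 2 through 7 is an assumption. The `max_order` and `experts` values are passed through unchanged because the table does not say how they map to individual orders.

```python
import json

# Rows copied from the table above: (PR number, raw JSON parameter string).
ROWS = [
    (813, '{"orders":"2-7"}'),
    (883, '{"max_order":13,"experts":13}'),
]


def parse_params(raw):
    """Decode one parameter cell and expand an "orders" range if present."""
    params = json.loads(raw)
    orders = params.get("orders")
    if isinstance(orders, str) and "-" in orders:
        # Assumption: "2-7" denotes the inclusive range of orders 2..7.
        low, high = (int(part) for part in orders.split("-", 1))
        params["orders"] = list(range(low, high + 1))
    return params


# Map each PR number to its decoded parameters.
parsed = {pr: parse_params(raw) for pr, raw in ROWS}

for pr, params in parsed.items():
    print(pr, params)
```

Keeping the raw JSON strings as the input means the sketch works directly on cells copied from the table, and any key it does not recognize is left untouched.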