token-shift mixing
Architecture
Used in: 1 PR
Best BPB: 1.2252
Avg BPB: 1.2252
Submissions
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 1112 | {"layers":8} |