depth/narrow transformer
Architecture
Used in: 1 PR
Best BPB: 1.3509
Avg BPB: 1.3509
Submissions
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 71 | {"layers":12,"model_dim":416} |
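
To give a rough sense of scale for the configuration in PR 71, the sketch below parses the `parameters` JSON from the table and estimates the parameter count of the transformer blocks. It is only an approximation under stated assumptions: a standard block with four `model_dim` x `model_dim` attention projections and a feed-forward layer with 4x expansion, ignoring embeddings, biases, and normalization weights. The source does not confirm any of these details, and the function name `approx_block_params` and the `ffn_mult` argument are hypothetical.

```python
import json


def approx_block_params(layers: int, model_dim: int, ffn_mult: int = 4) -> int:
    """Estimate weights in the transformer blocks only.

    Assumptions (not confirmed by the source): standard attention with
    Q, K, V, and output projections, plus a two-matrix feed-forward layer
    whose hidden size is ffn_mult * model_dim. Embeddings, biases, and
    normalization parameters are ignored.
    """
    attn = 4 * model_dim * model_dim            # Q, K, V, output projections
    ffn = 2 * ffn_mult * model_dim * model_dim  # up and down projections
    return layers * (attn + ffn)


# The "parameters" cell for PR 71, copied from the table.
row = '{"layers":12,"model_dim":416}'
cfg = json.loads(row)

total = approx_block_params(cfg["layers"], cfg["model_dim"])
print(f"layers={cfg['layers']} model_dim={cfg['model_dim']} approx_block_params={total:,}")
```

Under these assumptions the estimate comes to about 24.9M block parameters for 12 layers at width 416. The true count depends on details the table does not list, such as vocabulary size and the actual feed-forward width.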