Transformer depth
Category: Architecture
Used in: 7 PRs
Best BPB: 1.1550
Avg BPB: 1.2327
Submissions
Hyperparameters Across PRs
| PR number | Parameters |
|---|---|
| 60 | {"layers":10} |
| 63 | {"layers":10} |
| 166 | {"layers":10} |
| 242 | {"layers":10} |
| 793 | {"layers":10} |
| 805 | {"layers":11} |
| 830 | {"layers":11} |
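The depths in the table can be tallied with a few lines of Python. This is only a sketch for reading the listing: the `rows` list and the `layer_counts` helper are hypothetical names introduced here, not part of any submission, and the values are copied by hand from the table.

```python
from collections import Counter

# (PR number, layer count) pairs copied by hand from the table.
rows = [
    (60, 10),
    (63, 10),
    (166, 10),
    (242, 10),
    (793, 10),
    (805, 11),
    (830, 11),
]


def layer_counts(rows):
    """Return a Counter mapping each layer depth to the number of PRs using it."""
    return Counter(layers for _, layers in rows)


counts = layer_counts(rows)
for depth, n_prs in sorted(counts.items()):
    print(f"{depth} layers: {n_prs} PRs")
```

The tally shows 10 layers in five PRs and 11 layers in the two most recent ones. The page reports BPB only in aggregate, so the listing alone cannot attribute the best score to either depth.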