depth recurrence / weight sharing
Architecture
Used in: 2 PRs
Best BPB: 1.1454
Avg BPB: 1.1466
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 470 | {"layers":9} |
| 498 | {"unique_blocks":6,"loops":2,"effective_layers":12} |
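The PR 498 row reports 6 unique blocks run for 2 loops, giving 12 effective layers (6 × 2). A minimal sketch of that arithmetic follows. It assumes the whole stack of unique blocks is repeated in order on each loop; the table does not say whether that is the ordering actually used, and the names `layer_schedule` and `apply_recurrent_stack` are hypothetical, not taken from either PR.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def layer_schedule(unique_blocks: int, loops: int) -> List[int]:
    """Index of the shared block applied at each effective layer.

    Assumption: the full stack is replayed once per loop (0..N-1, 0..N-1, ...).
    """
    return [block for _ in range(loops) for block in range(unique_blocks)]


def apply_recurrent_stack(x: T, blocks: Sequence[Callable[[T], T]], loops: int) -> T:
    """Run the same blocks `loops` times, reusing their weights on every pass."""
    for _ in range(loops):
        for block in blocks:
            x = block(x)
    return x


# Values from the PR 498 row of the table.
schedule = layer_schedule(unique_blocks=6, loops=2)
print(len(schedule))       # effective depth
print(len(set(schedule)))  # number of distinct parameter sets
```

Under this assumption the parameter count scales with `unique_blocks` while the compute per forward pass scales with `unique_blocks * loops`, which is the trade that weight sharing makes.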