
depth recurrence / weight sharing

Architecture
Used in: 2 PRs
Best BPB: 1.1454
Avg BPB: 1.1466

Hyperparameters Across PRs

pr_number  parameters
470        {"layers":9}
498        {"unique_blocks":6,"loops":2,"effective_layers":12}
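The PR 498 entry can be read as 6 distinct blocks applied in 2 passes, which matches its reported 12 effective layers (6 × 2). A minimal Python sketch of that bookkeeping follows. The names `RecurrenceConfig` and `layer_schedule` are hypothetical, and the cyclic ordering (run all blocks, then repeat) is an assumption, since the table records only the counts and not how the loops are arranged.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecurrenceConfig:
    """Hypothetical container mirroring the keys in the PR 498 row."""
    unique_blocks: int  # number of distinct weight sets
    loops: int          # how many times the stack of blocks is reused

    @property
    def effective_layers(self) -> int:
        # Depth seen by the input: each unique block runs once per loop.
        return self.unique_blocks * self.loops


def layer_schedule(cfg: RecurrenceConfig) -> List[int]:
    """Return the block index applied at each effective layer.

    Assumes a cyclic order (0..N-1, then again); other orderings,
    such as repeating each block back to back, are equally possible.
    """
    return [block for _ in range(cfg.loops) for block in range(cfg.unique_blocks)]


cfg = RecurrenceConfig(unique_blocks=6, loops=2)
schedule = layer_schedule(cfg)
print(cfg.effective_layers)  # depth of the unrolled network
print(schedule)              # which shared block runs at each depth
```

Under this reading, the parameter count scales with `unique_blocks` while compute scales with `effective_layers`, which is the usual trade-off behind weight sharing.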