
depth sharing / shared-depth

Architecture
Used in: 1 PR
Best BPB: 1.6577
Avg BPB: 1.6577

Hyperparameters Across PRs

pr_number  parameters
276        {"layers":8,"physical_layers":4}
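The parameters for PR 276 suggest 8 logical layers built from 4 physical (weight-bearing) blocks, so each block would run twice per forward pass. The sketch below illustrates one possible mapping from logical layer to physical block. The cyclic order and the name `layer_schedule` are assumptions made for illustration; the page does not say how the PR orders the reuse.

```python
import json


def layer_schedule(layers: int, physical_layers: int) -> list[int]:
    """Return, for each logical layer, the index of the physical block it reuses.

    Assumption: cyclic reuse (0, 1, ..., P-1, 0, 1, ...). The actual PR may
    instead repeat each block consecutively or use some other order.
    """
    if physical_layers <= 0 or layers < physical_layers:
        raise ValueError("need 0 < physical_layers <= layers")
    return [i % physical_layers for i in range(layers)]


# Parameters copied from the table row for PR 276.
params = json.loads('{"layers":8,"physical_layers":4}')
schedule = layer_schedule(params["layers"], params["physical_layers"])
print(schedule)  # [0, 1, 2, 3, 0, 1, 2, 3]
```

Under this assumed mapping the model has the depth of 8 layers but stores weights for only 4 blocks.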