hierarchical token processing
Architecture
Used in: 1 PR
Best BPB: 0.6846
Avg BPB: 0.6846
Submissions
Hyperparameters Across PRs
| PR number | merge_factor | encoder_layers | decoder_layers | d_model |
|---|---|---|---|---|
| 1121 | 2 | 5 | 6 | 512 |
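
To make the `merge_factor` value for PR 1121 concrete, here is a minimal sketch of one way a merge step could shorten a token sequence. The page does not describe how the PR actually merges tokens, so the `merge_tokens` helper and its mean-pooling rule are assumptions for illustration only; the hyperparameter values are the ones listed above.

```python
import json

# Hyperparameters copied from the table row for PR 1121.
params = json.loads(
    '{"merge_factor":2,"encoder_layers":5,"decoder_layers":6,"d_model":512}'
)


def merge_tokens(vectors, merge_factor):
    """Mean-pool each run of `merge_factor` adjacent vectors into one.

    Hypothetical merge rule: the actual PR may use a different operation
    (for example concatenation plus projection, or a learned pooling).
    """
    if len(vectors) % merge_factor != 0:
        raise ValueError("sequence length must be a multiple of merge_factor")
    merged = []
    for start in range(0, len(vectors), merge_factor):
        group = vectors[start:start + merge_factor]
        # Average the group element-wise along the feature dimension.
        merged.append([sum(col) / merge_factor for col in zip(*group)])
    return merged


# Toy input: 8 token vectors of width d_model, token i filled with the value i.
tokens = [[float(i)] * params["d_model"] for i in range(8)]
coarse = merge_tokens(tokens, params["merge_factor"])
print(len(tokens), "->", len(coarse))  # 8 -> 4
```

With `merge_factor` set to 2, the coarse sequence has half as many positions as the input while each vector keeps width `d_model`. Which of the encoder or decoder layers would run at the shorter length is not stated on this page.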