Hybrid ETD Transformer
Architecture
Used in: 1 PR
Best BPB: 1.1169
Avg BPB: 1.1169
Submissions
Hyperparameters Across PRs
| PR | encoder_layers | think_layers | think_passes | decoder_layers | d_model | heads | kv_heads |
|---|---|---|---|---|---|---|---|
| 1828 | 3 | 3 | 3 | 4 | 512 | 8 | 4 |
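
To make the hyperparameters for PR 1828 easier to read, here is a minimal sketch that derives a few quantities from them. It rests on assumptions the page does not confirm: that `heads` and `kv_heads` describe grouped-query attention, and that the `think_layers` block is applied `think_passes` times with shared weights. The variable names (`head_dim`, `queries_per_kv`, `unique_layers`, `effective_depth`) are illustrative only.

```python
import json

# Hyperparameters as listed for PR 1828.
raw = (
    '{"encoder_layers":3,"think_layers":3,"think_passes":3,'
    '"decoder_layers":4,"d_model":512,"heads":8,"kv_heads":4}'
)
cfg = json.loads(raw)

# Per-head width, assuming d_model is split evenly across query heads.
head_dim = cfg["d_model"] // cfg["heads"]

# Query heads sharing each key/value head, assuming grouped-query attention.
queries_per_kv = cfg["heads"] // cfg["kv_heads"]

# Distinct layers that hold parameters.
unique_layers = (
    cfg["encoder_layers"] + cfg["think_layers"] + cfg["decoder_layers"]
)

# Layer applications per forward pass, ASSUMING the think block is
# re-run think_passes times (not confirmed by the source).
effective_depth = (
    cfg["encoder_layers"]
    + cfg["think_layers"] * cfg["think_passes"]
    + cfg["decoder_layers"]
)

print(head_dim, queries_per_kv, unique_layers, effective_depth)
```

Under those assumptions, the configuration has 10 parameterized layers but 16 layer applications per forward pass, so the repeated think block adds depth without adding parameters.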