Shared-Specific Attention
Architecture
Used in: 1 PR
Best BPB: 1.0981
Avg BPB: 1.0981
Submissions
Hyperparameters Across PRs
| PR number | Parameters |
|---|---|
| 1774 | {"shared_head_dim":16,"specific_dim":48} |