attention modifications
Architecture
Used in: 7 PRs
Best BPB: 1.0590
Avg BPB: 1.2747
Submissions
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 981 | — |
| 1074 | {"hyperbolic_qk_mix":0.02,"hyperbolic_radius_init":0.1} |
| 1168 | — |
| 1558 | {"qk_gain":4} |
| 2006 | — |
| 2006 | — |
| 2070 | — |