bidirectional attention
Architecture
Used in: 2 PRs
Best BPB: 1.3485
Avg BPB: 1.3542
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 1053 | — |
| 1403 | {"is_causal":false} |
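To make the `is_causal` setting concrete, here is a minimal pure-Python sketch of how a causal mask changes attention weights. Only the flag name `is_causal` comes from the table above; the helper `attention_weights` and the toy scores are hypothetical and are not taken from either PR.

```python
import math

def attention_weights(scores, is_causal):
    """Row-wise softmax over a square matrix of attention scores.

    Hypothetical helper for illustration only.
    is_causal=True:  query i may attend only to positions j <= i.
    is_causal=False: every query attends to every position (bidirectional).
    """
    n = len(scores)
    weights = []
    for i in range(n):
        # Positions visible to query i under the chosen masking mode.
        visible = [j for j in range(n) if not is_causal or j <= i]
        # Subtract the row maximum before exponentiating for numerical stability.
        row_max = max(scores[i][j] for j in visible)
        exps = {j: math.exp(scores[i][j] - row_max) for j in visible}
        total = sum(exps.values())
        # Masked positions get weight 0.0.
        weights.append([exps.get(j, 0.0) / total for j in range(n)])
    return weights

# Uniform scores make the effect of the mask easy to read.
scores = [[0.0] * 3 for _ in range(3)]
causal = attention_weights(scores, is_causal=True)
bidirectional = attention_weights(scores, is_causal=False)

print(causal[0])         # first token sees only itself
print(bidirectional[0])  # first token sees all three positions equally
```

With the causal mask the first position puts all of its weight on itself, while with `is_causal` set to false it spreads weight over later positions as well. The last position behaves the same in both modes because it already sees the whole sequence.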