GatedDeltaNet / Flash Linear Attention
Architecture
Used in: 1 PR
Best BPB: 1.0339
Avg BPB: 1.0339
Submissions
Hyperparameters Across PRs
| PR number | Hyperparameters |
|---|---|
| 1705 | {"layers":10,"model_dim":544} |