Gated DeltaNet hybrid
Architecture
Used in: 1 PR
Best BPB: 1.0171
Avg BPB: 1.0171
Submissions
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 1564 | {"layers_pattern":"[GDN×5] → SWA → [GDN×5] → SWA_shared","tokenizer":"SP1024"} |
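
The `layers_pattern` string for PR 1564 can be expanded into a flat layer list to make the stack depth explicit. The sketch below is illustrative only: `expand_layers_pattern` is a hypothetical helper, not code from the PR, and it assumes that `[NAME×N]` means N consecutive layers of that type and that `→` separates consecutive blocks. Reading `GDN` as a Gated DeltaNet layer follows from the page title; what `SWA` and `SWA_shared` denote (for example, a sliding-window attention layer and a copy that reuses its weights) is not stated here, so the names are kept as opaque labels.

```python
import re


def expand_layers_pattern(pattern: str) -> list[str]:
    """Expand a pattern such as '[GDN×5] → SWA' into a flat list of layer names.

    Assumption: '[NAME×N]' stands for N consecutive NAME layers, and
    '→' separates consecutive blocks. Any other block is kept verbatim.
    """
    layers: list[str] = []
    for block in (part.strip() for part in pattern.split("→")):
        repeated = re.fullmatch(r"\[(\w+)×(\d+)\]", block)
        if repeated:
            name, count = repeated.group(1), int(repeated.group(2))
            layers.extend([name] * count)
        else:
            # A single, unrepeated layer such as 'SWA' or 'SWA_shared'.
            layers.append(block)
    return layers


# Pattern copied verbatim from the hyperparameters table above.
pattern = "[GDN×5] → SWA → [GDN×5] → SWA_shared"
layers = expand_layers_pattern(pattern)

for index, name in enumerate(layers):
    print(index, name)
```

Under these assumptions the pattern expands to 12 layers: ten `GDN` layers in two groups of five, each group followed by one `SWA`-type layer.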