DeltaNet
Architecture
Used in: 4 PRs
Best BPB: 0.7614
Avg BPB: 0.8876
Submissions
Hyperparameters Across PRs
| PR number | Parameters |
|---|---|
| 990 | {"heads":4} |
| 1028 | {"heads":4,"short_conv":true,"loops":4,"flat_layers":4,"crawler_layers":1} |
| 1047 | {"heads":4} |
| 1286 | {"layers":8,"final_attention_layer":1,"n_embd":384} |