linear attention

Architecture
Used in: 1 PR
Best BPB: 1.2008
Avg BPB: 1.2008

Hyperparameters Across PRs

pr_number    parameters
1863