attention modification

Architecture
Used in: 4 PRs
Best BPB: 1.0756
Avg BPB: 1.2137

Hyperparameters Across PRs

pr_number  parameters
1212       {"window_size":512,"layers":[2,4,6,8,10]}
1239       {"curvature_range":[0.1,2]}
1530       (none listed)
1648       {"layers":11}
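To compare these per-PR settings programmatically, the parameter strings can be parsed as JSON. The following is a minimal sketch, not part of the original page: the names `ROWS`, `parse_rows`, and `keys_by_pr` are hypothetical, and treating PR 1530 as an empty configuration is an assumption based only on its blank entry.

```python
import json

# Rows copied from the listing above: (pr_number, raw parameters JSON).
# PR 1530 has no parameters listed, so it is stored as None (assumption).
ROWS = [
    (1212, '{"window_size":512,"layers":[2,4,6,8,10]}'),
    (1239, '{"curvature_range":[0.1,2]}'),
    (1530, None),
    (1648, '{"layers":11}'),
]


def parse_rows(rows):
    """Return {pr_number: parameter dict}, using {} when none are listed."""
    return {pr: json.loads(raw) if raw else {} for pr, raw in rows}


def keys_by_pr(params):
    """Map each parameter name to the sorted list of PRs that set it."""
    out = {}
    for pr, cfg in params.items():
        for key in cfg:
            out.setdefault(key, []).append(pr)
    return {key: sorted(prs) for key, prs in out.items()}


params = parse_rows(ROWS)
usage = keys_by_pr(params)
print(usage)
```

Note that `layers` is a list of indices in PR 1212 but a single integer in PR 1648, so code that compares the two should not assume one type for that key.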