
inhibitory layers

Architecture
Used in: 1 PR
Best BPB: 1.0644
Avg BPB: 1.0644

Hyperparameters Across PRs

pr_number  parameters
2116       {"rank":22,"paths":["attention residual","MLP residual"]}
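The parameters value for PR 2116 is a JSON object, so it can be read programmatically. The following is a minimal sketch, assuming the value is available as a string; the variable names are illustrative and not part of the page, and the meaning of `rank` and `paths` is not specified here beyond what the record shows.

```python
import json

# Raw `parameters` value for PR 2116, copied verbatim from the record.
raw_parameters = '{"rank":22,"paths":["attention residual","MLP residual"]}'

# Parse the JSON object into a Python dict.
params = json.loads(raw_parameters)

rank = params["rank"]    # integer hyperparameter
paths = params["paths"]  # list of path names the setting applies to

print(f"rank={rank}")
for path in paths:
    print(f"path: {path}")
```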