
sparsemax routing

Architecture
Used in: 1 PR
Best BPB: 0.4380
Avg BPB: 0.4380

Hyperparameters Across PRs

pr_number    parameters
663