← Back to Architecture

Value Residual Learning

Architecture
Used in
3 PRs
Best BPB
0.3212
Avg BPB
0.7904

Hyperparameters Across PRs

pr_numberparameters
733
745
850{"layers":[1,10]}