
depth

Architecture
Used in: 12 PRs
Best BPB: 1.0981
Avg BPB: 1.1992

Hyperparameters Across PRs

pr_number  parameters
81         {"layers":10}
215        {"layers":11,"encoder_layers":5,"decoder_layers":6}
447        {"layers":10}
450        {"layers":12}
474        {"layers":12}
675        {"layers":10}
858        {"layers":11}
939        {"layers":7}
1085       {"layers":11}
1774       {"layers":12}
1940       {"layers":11}
2112       {"layers":10,"model_dim":1024,"mlp_mult":2}
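
To see how often each depth was tried, the rows above can be tallied with a short script. This is only a sketch: the rows are copied by hand from the table, and the names ROWS, parse_rows, and layer_counts are illustrative, not part of any tool behind this page.

```python
import json
from collections import Counter

# Rows copied by hand from the table above: PR number, then its JSON parameters.
ROWS = """\
81 {"layers":10}
215 {"layers":11,"encoder_layers":5,"decoder_layers":6}
447 {"layers":10}
450 {"layers":12}
474 {"layers":12}
675 {"layers":10}
858 {"layers":11}
939 {"layers":7}
1085 {"layers":11}
1774 {"layers":12}
1940 {"layers":11}
2112 {"layers":10,"model_dim":1024,"mlp_mult":2}
"""


def parse_rows(text):
    """Map each PR number to its parsed parameter dict."""
    rows = {}
    for line in text.strip().splitlines():
        pr_number, params = line.split(maxsplit=1)
        rows[int(pr_number)] = json.loads(params)
    return rows


rows = parse_rows(ROWS)

# Count how many PRs used each value of "layers".
layer_counts = Counter(params["layers"] for params in rows.values())

for layers, count in sorted(layer_counts.items()):
    print(f"{layers} layers: {count} PR(s)")
```

On these rows the tally is one PR at 7 layers, four at 10, four at 11, and three at 12. The table does not list a BPB per PR, so this says nothing about which depth produced the best score.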