10L Transformer

Architecture
Used in: 1 PR
Best BPB: 1.1400
Avg BPB: 1.1400

Hyperparameters Across PRs

pr_number  parameters
361        {"layers":10}