Transformer layers

Architecture
Used in: 1 PR
Best BPB: 1.1876
Avg BPB: 1.1876

Hyperparameters Across PRs

pr_number  parameters
155        {"layers":10}