← Back to Architecture

encoder-decoder depth split

Architecture
Used in
1 PRs
Best BPB
1.2374
Avg BPB
1.2374

Hyperparameters Across PRs

pr_numberparameters
391{"layers":6,"encoder_layers":3,"decoder_layers":3}