
PLE

Architecture
Used in: 1 PR
Best BPB: 1.2195
Avg BPB: 1.2195

Hyperparameters Across PRs

pr_number    parameters
2062         {"per_layer_embed_dim":64,"per_layer_embed_init_std":0.02}
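
As a rough illustration of how the two hyperparameters above might be consumed, here is a minimal sketch that parses the parameters string and draws one embedding table per layer from a normal distribution with the listed standard deviation. It assumes, based only on the parameter names, that PLE refers to per-layer embeddings. The function name init_per_layer_embeddings and the num_layers and vocab_size values are hypothetical, since the page does not list them, and this is not taken from PR 2062 itself.

```python
import json
import random

# Hyperparameters copied from the table row for PR 2062.
RAW_PARAMS = '{"per_layer_embed_dim":64,"per_layer_embed_init_std":0.02}'


def init_per_layer_embeddings(params, num_layers, vocab_size, seed=0):
    """Return one (vocab_size x dim) table per layer, drawn from N(0, std^2).

    num_layers and vocab_size are hypothetical; the page does not list them.
    """
    dim = params["per_layer_embed_dim"]
    std = params["per_layer_embed_init_std"]
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(vocab_size)]
        for _ in range(num_layers)
    ]


params = json.loads(RAW_PARAMS)
# Tiny illustrative sizes, chosen only to keep the example fast.
tables = init_per_layer_embeddings(params, num_layers=2, vocab_size=8)
print(len(tables), len(tables[0]), len(tables[0][0]))  # prints "2 8 64"
```

With per_layer_embed_dim set to 64, each per-layer table in this sketch adds vocab_size x 64 values, so the total grows linearly with both the layer count and the vocabulary size.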