U-Net encoder/decoder
Architecture
Used in: 2 PRs
Best BPB: 1.1239
Avg BPB: 1.1404
Hyperparameters Across PRs
| PR number | Parameters |
|---|---|
| 640 | {"layers":10,"dim":768,"heads":8,"kv_heads":4,"head_dim":96,"MLP_expansion":4,"MLP_hidden":3072,"activation":"relu²","embedding_dim":254,"vocab_size":8192,"positional_encoding":"YaRN max_len=2048 ROPE_BASE=5000"} |
| 641 | {"layers":15,"dim":768,"heads":8,"kv_heads":4,"head_dim":96} |
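The parameter strings in the table can be sanity-checked against each other. The sketch below is illustrative and not taken from either PR. It parses the two JSON strings as they appear in the rows and checks that `heads × head_dim` equals `dim` and, where both fields are present, that `MLP_expansion × dim` equals `MLP_hidden`. The helper name `check_config` is hypothetical. Reading `kv_heads` smaller than `heads` as query heads sharing key/value heads is an assumption, not something the table states.

```python
import json

# Parameter strings copied from the table rows for PR 640 and PR 641.
ROWS = {
    640: '{"layers":10,"dim":768,"heads":8,"kv_heads":4,"head_dim":96,'
         '"MLP_expansion":4,"MLP_hidden":3072,"activation":"relu²",'
         '"embedding_dim":254,"vocab_size":8192,'
         '"positional_encoding":"YaRN max_len=2048 ROPE_BASE=5000"}',
    641: '{"layers":15,"dim":768,"heads":8,"kv_heads":4,"head_dim":96}',
}


def check_config(cfg):
    """Hypothetical helper: derive a few quantities and consistency flags."""
    report = {
        # The per-head width times the head count should give the model width.
        "heads_x_head_dim_matches_dim": cfg["heads"] * cfg["head_dim"] == cfg["dim"],
        # Assumption: each key/value head is shared by this many query heads.
        "queries_per_kv_head": cfg["heads"] // cfg["kv_heads"],
    }
    # The MLP fields are only listed for PR 640, so check them when present.
    if "MLP_hidden" in cfg and "MLP_expansion" in cfg:
        report["mlp_hidden_matches_expansion"] = (
            cfg["MLP_expansion"] * cfg["dim"] == cfg["MLP_hidden"]
        )
    return report


REPORTS = {pr: check_config(json.loads(raw)) for pr, raw in ROWS.items()}

for pr, report in REPORTS.items():
    print(pr, report)
```

For the values listed, both rows pass the head-width check, and the PR 640 row passes the MLP check. The PR 641 row lists no MLP fields, so that check is skipped for it.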