← Back to Architecture

U-Net encoder/decoder

Architecture
Used in
2 PRs
Best BPB
1.1239
Avg BPB
1.1404

Hyperparameters Across PRs

pr_numberparameters
640{"layers":10,"dim":768,"heads":8,"kv_heads":4,"head_dim":96,"MLP_expansion":4,"MLP_hidden":3072,"activation":"relu²","embedding_dim":254,"vocab_size":8192,"positional_encoding":"YaRN max_len=2048 ROPE_BASE=5000"}
641{"layers":15,"dim":768,"heads":8,"kv_heads":4,"head_dim":96}