← Back to Architecture

H-Net

Architecture
Used in
1 PRs
Best BPB
1.2070
Avg BPB
1.2070

Hyperparameters Across PRs

pr_numberparameters
1305{"layers":12,"encoder_layers":3,"main_layers":6,"decoder_layers":3,"model_dim":512,"heads":8,"kv_heads":4,"chunk_target_size":6,"vocab_size":260}