Parameter Banking

Architecture
Used in: 2 PRs
Best BPB: 1.1091
Avg BPB: 1.1169

Hyperparameters Across PRs

pr_number    parameters
399          {"qo_bank":[22,512,512],"kv_bank":[22,256,512],"mlp_up_bank":[11,1536,512],"mlp_down_bank":[11,512,1536]}
1126
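
To give a sense of scale for the PR 399 row, the bank shapes can be turned into element counts. The following is a minimal sketch, not code from either PR: it assumes each bank is a dense tensor whose size is the product of its listed dimensions, and the helper name bank_sizes is hypothetical. What the leading dimension (22 or 11) represents is not stated in the table, so the sketch treats it only as one more axis.

```python
import json
from math import prod

# Parameters string copied verbatim from the PR 399 row.
PR_399_PARAMS = (
    '{"qo_bank":[22,512,512],"kv_bank":[22,256,512],'
    '"mlp_up_bank":[11,1536,512],"mlp_down_bank":[11,512,1536]}'
)


def bank_sizes(params_json: str) -> dict[str, int]:
    """Hypothetical helper: map each bank name to its element count.

    Assumes every bank is a dense tensor, so its size is simply the
    product of the dimensions listed for it.
    """
    shapes = json.loads(params_json)
    return {name: prod(shape) for name, shape in shapes.items()}


sizes = bank_sizes(PR_399_PARAMS)
total = sum(sizes.values())

for name, count in sizes.items():
    print(f"{name:14s} {count:>12,d}")
print(f"{'total':14s} {total:>12,d}")
# qo_bank           5,767,168
# kv_bank           2,883,584
# mlp_up_bank       8,650,752
# mlp_down_bank     8,650,752
# total            25,952,256
```

Under these assumptions the four banks hold about 26M elements in total. The two attention banks together are the same size as each MLP bank.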