
distributed cache pre-fill

Evaluation
Used in: 1 PR
Best BPB: 0.6678
Avg BPB: 0.6678

Hyperparameters Across PRs

pr_number  parameters
806        {"multi_gpu": true, "rank": 7, "prefill_tokens": 54000000, "prefill_time_seconds": 68}
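
The logged values for PR 806 imply a pre-fill throughput, which can be worked out directly from the row above. The sketch below parses the parameters and divides tokens by seconds. The helper name `prefill_throughput` is hypothetical, and the page does not say whether the token count is per rank or aggregated across GPUs, so treat the result as the rate for whatever scope the log covers.

```python
import json

# Parameters copied from the PR 806 row above.
raw = '{"multi_gpu":true,"rank":7,"prefill_tokens":54000000,"prefill_time_seconds":68}'
params = json.loads(raw)


def prefill_throughput(p):
    """Tokens per second implied by the logged token count and wall time.

    Hypothetical helper for illustration; not part of the PR.
    """
    return p["prefill_tokens"] / p["prefill_time_seconds"]


tokens_per_second = prefill_throughput(params)
print(f"{tokens_per_second:,.0f} tokens/s")
```

With the values shown, this works out to roughly 794,000 tokens per second (54,000,000 tokens in 68 seconds).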