Discriminative per-block pre-quant TTT
Category: Test-Time Training
Used in: 1 PR
Best BPB: 1.0050
Avg BPB: 1.0050
Submissions
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 1372 | {"graduated_lr":"0.3x->1.0x","layer_groups":10} |
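
The two hyperparameters above can be read as a per-group learning-rate schedule. The following is a minimal sketch of one possible interpretation, not the method used in PR 1372: it assumes the 10 layer groups are contiguous blocks of layers, that the multiplier rises linearly from 0.3x to 1.0x, and that the earliest group receives the smallest multiplier (as in conventional discriminative fine-tuning). The function names `graduated_lr_multipliers` and `assign_layers_to_groups` are hypothetical and introduced only for illustration.

```python
def graduated_lr_multipliers(num_groups=10, start=0.3, end=1.0):
    """Return one LR multiplier per layer group.

    Assumption: linear spacing from `start` to `end`; the source only
    states the endpoints ("0.3x->1.0x") and the group count (10).
    """
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    if num_groups == 1:
        return [end]
    step = (end - start) / (num_groups - 1)
    return [start + i * step for i in range(num_groups)]


def assign_layers_to_groups(num_layers, num_groups=10):
    """Map each layer index to a group index.

    Assumption: groups are contiguous, roughly equal-sized runs of layers.
    """
    return [min(i * num_groups // num_layers, num_groups - 1)
            for i in range(num_layers)]


if __name__ == "__main__":
    base_lr = 1e-3  # hypothetical base learning rate, not from the source
    multipliers = graduated_lr_multipliers()
    groups = assign_layers_to_groups(num_layers=20)
    for layer, group in enumerate(groups):
        print(f"layer {layer:2d} -> group {group}, lr = {base_lr * multipliers[group]:.2e}")
```

Under these assumptions, each layer's effective learning rate is the base rate scaled by its group's multiplier, so a framework that supports per-parameter-group learning rates could consume the resulting list directly.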