tiny eval-time SGD
Test-Time Training
Used in: 1 PR
Best BPB: 1.2427
Avg BPB: 1.2427
Submissions
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 272 | {"targets":["q_gain","attn_scale","mlp_scale","resid_mix","skip_weights"]} |
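The `targets` list above names the parameter groups that the eval-time SGD adapts. As a rough illustration only, the sketch below shows one way such a list could restrict updates to a small subset of parameters. The substring matching rule, the parameter names, the stand-in gradients, and the learning rate are all assumptions made for this example; the page does not show how PR 272 actually implements it.

```python
import json

# Config row copied from the table above (PR 272).
CONFIG = json.loads(
    '{"targets":["q_gain","attn_scale","mlp_scale","resid_mix","skip_weights"]}'
)


def select_targets(param_names, targets):
    """Return names containing any target substring (assumed matching rule)."""
    return [n for n in param_names if any(t in n for t in targets)]


def sgd_step(params, grads, trainable, lr):
    """Plain SGD on the trainable subset; other parameters are left untouched."""
    return {
        n: (v - lr * grads[n] if n in trainable else v)
        for n, v in params.items()
    }


# Hypothetical parameter names and scalar values, for illustration only.
params = {
    "blocks.0.attn.q_gain": 1.0,
    "blocks.0.attn_scale": 1.0,
    "blocks.0.mlp.fc.weight": 0.5,
    "skip_weights": 0.25,
}
grads = {name: 1.0 for name in params}  # stand-in gradients, not real ones

trainable = set(select_targets(params, CONFIG["targets"]))
updated = sgd_step(params, grads, trainable, lr=0.5)  # lr is an assumed value
```

In this toy setup only the gain, scale, and skip entries move, while the hypothetical `mlp.fc.weight` entry stays fixed. That matches the "tiny" framing: the update touches a handful of scalar-like controls rather than the full weight matrices.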