score-first AdamW TTT
Test-Time Training
Used in: 2 PRs
Best BPB: 1.1172
Avg BPB: 1.1176
Hyperparameters Across PRs
| PR number | Parameters |
|---|---|
| 544 | {"chunk_tokens":131072,"epochs":3,"learning_rate":0.0001,"freeze_blocks":2,"stride":32} |
| 790 | {"chunk":131072,"unfrozen":"last 2 blocks plus control params","grouped_optimizer":true} |
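
A minimal sketch of the control flow the name suggests, assuming "score-first" means each chunk is scored with the current weights before any update is made on it. That reading is an assumption, not something the table confirms. `chunk_tokens` and `epochs` mirror the keys in the PR 544 row; `score_first_ttt`, `score_fn`, and `adapt_fn` are hypothetical names introduced for illustration. The AdamW step itself, `learning_rate`, `freeze_blocks`, and `stride` are not modeled here.

```python
from typing import Callable, List, Sequence


def score_first_ttt(
    tokens: Sequence[int],
    chunk_tokens: int,
    epochs: int,
    score_fn: Callable[[Sequence[int]], float],
    adapt_fn: Callable[[Sequence[int]], None],
) -> List[float]:
    """Score each chunk with the current weights, then adapt on that chunk.

    score_fn and adapt_fn are hypothetical callbacks: score_fn returns a
    loss-like number for a chunk, adapt_fn performs one pass of updates
    (in the real setting, presumably AdamW steps on the unfrozen blocks).
    """
    if chunk_tokens <= 0:
        raise ValueError("chunk_tokens must be positive")
    if epochs < 0:
        raise ValueError("epochs must be non-negative")

    scores: List[float] = []
    for start in range(0, len(tokens), chunk_tokens):
        chunk = tokens[start:start + chunk_tokens]
        # Score first: the chunk is evaluated before any update has seen it.
        scores.append(score_fn(chunk))
        # Then adapt on the same chunk for the configured number of epochs.
        for _ in range(epochs):
            adapt_fn(chunk)
    return scores
```

With the PR 544 values this would be called as `score_first_ttt(tokens, chunk_tokens=131072, epochs=3, score_fn=..., adapt_fn=...)`. Under this ordering, the reported score for a chunk never benefits from training on that same chunk.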