AdamW TTT
Test-Time Training
Used in: 8 PRs
Best BPB: 0.8265
Avg BPB: 1.0478
Submissions
Hyperparameters Across PRs
| pr_number | parameters |
|---|---|
| 430 | {"epochs":3,"learning_rate":0.001,"betas":[0.9,0.999],"frozen_layers":6} |
| 462 | {"learning_rate":0.0005,"epochs":10,"weight_decay":0} |
| 489 | {"learning_rate":0.0005,"weight_decay":0,"epochs":5} |
| 532 | {"epochs":10,"learning_rate":0.001,"grad_clip":1,"all_params_unfrozen":true} |
| 555 | {"epochs":10} |
| 1350 | {"epochs":6,"freeze_first_blocks":2} |
| 1485 | {"epochs":6,"learning_rate":0.0005,"freeze_blocks":2,"schedule":"cosine decay","pre_quant":true} |
| 1488 | {"epochs":10,"learning_rate":0.00045,"freeze_blocks":1} |
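The parameter keys in the table are not uniform across PRs: the number of frozen layers or blocks appears as `frozen_layers`, `freeze_first_blocks`, or `freeze_blocks`, and several PRs omit `learning_rate` or `weight_decay` entirely. The sketch below is one possible way to normalize the rows for comparison. It assumes the three freeze keys describe the same setting and that `all_params_unfrozen` means nothing is frozen; neither assumption is confirmed by the PRs themselves. The names `normalize`, `FREEZE_ALIASES`, and `configs` are hypothetical helpers introduced only for illustration.

```python
import json
from statistics import median

# Rows copied verbatim from the table above: (pr_number, parameters as JSON).
ROWS = [
    (430, '{"epochs":3,"learning_rate":0.001,"betas":[0.9,0.999],"frozen_layers":6}'),
    (462, '{"learning_rate":0.0005,"epochs":10,"weight_decay":0}'),
    (489, '{"learning_rate":0.0005,"weight_decay":0,"epochs":5}'),
    (532, '{"epochs":10,"learning_rate":0.001,"grad_clip":1,"all_params_unfrozen":true}'),
    (555, '{"epochs":10}'),
    (1350, '{"epochs":6,"freeze_first_blocks":2}'),
    (1485, '{"epochs":6,"learning_rate":0.0005,"freeze_blocks":2,'
           '"schedule":"cosine decay","pre_quant":true}'),
    (1488, '{"epochs":10,"learning_rate":0.00045,"freeze_blocks":1}'),
]

# Assumption: these keys are aliases for "how many leading layers/blocks stay frozen".
FREEZE_ALIASES = ("frozen_layers", "freeze_first_blocks", "freeze_blocks")


def normalize(params: dict) -> dict:
    """Map one PR's parameter dict onto a common set of keys (None = not reported)."""
    frozen = next((params[k] for k in FREEZE_ALIASES if k in params), None)
    # Assumption: an explicit "all_params_unfrozen" flag means zero frozen blocks.
    if params.get("all_params_unfrozen"):
        frozen = 0
    return {
        "epochs": params.get("epochs"),
        "learning_rate": params.get("learning_rate"),
        "weight_decay": params.get("weight_decay"),
        "frozen": frozen,
    }


configs = {pr: normalize(json.loads(raw)) for pr, raw in ROWS}

# Summaries use only the PRs that actually report the value.
lrs = [c["learning_rate"] for c in configs.values() if c["learning_rate"] is not None]
epochs = [c["epochs"] for c in configs.values()]

print(f"learning_rate reported by {len(lrs)} of {len(configs)} PRs, median {median(lrs)}")
print(f"epochs range {min(epochs)} to {max(epochs)}, median {median(epochs)}")
```

Missing values are kept as `None` rather than filled with optimizer defaults, since the table does not say which defaults each PR relied on.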