online logit bias

Test-Time Training
Used in: 2 PRs
Best BPB: 1.1248
Avg BPB: 1.1429

Hyperparameters Across PRs

pr_number    parameters
218          {"learning_rate":0.1,"momentum":0.9}
330          {"learning_rate":0.1,"enabled":false}
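
The page lists only hyperparameters, so the following is a minimal sketch of what an online logit bias could look like, not the implementation used in either PR. It assumes the technique keeps one bias value per vocabulary entry, adds it to the model's logits, scores each token before adapting, and then takes one SGD-with-momentum step on the cross-entropy of the token just seen. The class name OnlineLogitBias, the method score_and_update, and the default momentum of 0.9 are hypothetical; learning_rate, momentum, and enabled mirror the keys in the table above.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


class OnlineLogitBias:
    """Hypothetical sketch: a per-vocabulary bias adapted at test time."""

    def __init__(self, vocab_size, learning_rate=0.1, momentum=0.9, enabled=True):
        self.bias = [0.0] * vocab_size
        self.velocity = [0.0] * vocab_size
        self.learning_rate = learning_rate
        self.momentum = momentum  # default value is an assumption
        self.enabled = enabled

    def score_and_update(self, logits, target):
        """Return the loss in bits for `target`, then adapt the bias on it."""
        if not self.enabled:
            # Disabled: score with the raw logits and leave the bias untouched.
            return -math.log2(softmax(logits)[target])

        # Score first, so the prediction never sees the token it is graded on.
        probs = softmax([l + b for l, b in zip(logits, self.bias)])
        loss_bits = -math.log2(probs[target])

        # Gradient of cross-entropy w.r.t. the bias is probs - one_hot(target).
        for i, p in enumerate(probs):
            grad = p - (1.0 if i == target else 0.0)
            self.velocity[i] = self.momentum * self.velocity[i] + grad
            self.bias[i] -= self.learning_rate * self.velocity[i]
        return loss_bits


if __name__ == "__main__":
    olb = OnlineLogitBias(vocab_size=3, learning_rate=0.1, momentum=0.9)
    uniform_logits = [0.0, 0.0, 0.0]
    for step in range(3):
        # Loss on a repeated target should shrink as the bias adapts.
        print(step, olb.score_and_update(uniform_logits, target=2))
```

Under these assumptions, the PR 218 row corresponds to OnlineLogitBias(vocab_size, learning_rate=0.1, momentum=0.9), and the PR 330 row to the same object constructed with enabled=False, which reduces to plain scoring.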