
self-distillation TTT

Test-Time Training
Used in: 1 PR
Best BPB: 1.1257
Avg BPB: 1.1257

Hyperparameters Across PRs

pr_number    parameters
379          {"temperature": 2, "freeze_blocks": 4, "epochs": 2, "learning_rate": 0.001}
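
The page does not say how these hyperparameters are consumed, so the following is only a minimal sketch of how they might map onto a self-distillation objective. It assumes, without confirmation from PR 379, that `temperature` softens both teacher and student logits before a KL-divergence loss (with the common T^2 scaling), and that `freeze_blocks` counts blocks frozen from the start of the network. The helper names `softmax`, `distillation_loss`, and `trainable_blocks` are hypothetical and written in pure Python for illustration.

```python
import json
import math

# Hyperparameters copied verbatim from the table row for PR 379.
params = json.loads(
    '{"temperature":2,"freeze_blocks":4,"epochs":2,"learning_rate":0.001}'
)


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is a common distillation convention, not necessarily
    the loss used in PR 379.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def trainable_blocks(num_blocks, freeze_blocks):
    """Indices of blocks left trainable.

    Assumption: the FIRST `freeze_blocks` blocks are the frozen ones.
    """
    return list(range(freeze_blocks, num_blocks))
```

Under these assumptions, a test-time loop would run `params["epochs"]` passes over the test-time data, updating only the blocks returned by `trainable_blocks` with step size `params["learning_rate"]`.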