readout_only

Test-Time Training
Used in: 1 PR
Best BPB: 1.0976
Avg BPB: 1.0976

Hyperparameters Across PRs

pr_number  parameters
1704       {"learning_rate":0.005,"epochs":3,"prefix_chunk_ratio":0.2,"prefix_epochs":4,"prefix_lr_scale":1.15,"prefix_hard_window_fraction":0.25}
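The parameters recorded for PR 1704 are a flat JSON object, so they can be loaded into a typed record for comparison or reuse. The following is a minimal sketch, not the PR's actual code: the names `ReadoutOnlyParams`, `parse_params`, and `RAW_PARAMETERS` are hypothetical, and the field types are inferred from the values shown (integers for the two epoch counts, floats elsewhere). The page does not define what each field controls, so the sketch only parses and validates the keys.

```python
import json
from dataclasses import dataclass

# Parameters string copied verbatim from the PR 1704 row.
RAW_PARAMETERS = (
    '{"learning_rate":0.005,"epochs":3,"prefix_chunk_ratio":0.2,'
    '"prefix_epochs":4,"prefix_lr_scale":1.15,'
    '"prefix_hard_window_fraction":0.25}'
)


@dataclass(frozen=True)
class ReadoutOnlyParams:
    """Hypothetical container; field names mirror the JSON keys exactly."""

    learning_rate: float
    epochs: int
    prefix_chunk_ratio: float
    prefix_epochs: int
    prefix_lr_scale: float
    prefix_hard_window_fraction: float


def parse_params(raw: str) -> ReadoutOnlyParams:
    """Decode the JSON object; a missing or unexpected key raises TypeError."""
    return ReadoutOnlyParams(**json.loads(raw))


params = parse_params(RAW_PARAMETERS)
print(params)
```

Because the dataclass constructor rejects unknown or missing keyword arguments, a row from another PR with a different key set would fail at parse time instead of being silently accepted.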