
Cascaded 2-Phase L-BFGS

Category: Test-Time Training
Used in: 1 PR
Best BPB: 1.0050
Avg BPB: 1.0050

Hyperparameters Across PRs

pr_number    parameters
1372         {"phase1_iters":5,"phase1_history":10,"phase2_iters":18,"phase2_history":20,"history_reset_between_phases":true}
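The page lists only the hyperparameters, not the implementation, so the following is a minimal sketch of how they might fit together. It rests on assumptions: that the two phases are back-to-back L-BFGS runs with their own iteration counts and history sizes, and that history_reset_between_phases means the stored curvature pairs are discarded before phase 2. The function names (lbfgs_phase, cascaded_lbfgs) and the toy quadratic objective are hypothetical stand-ins; the actual PR presumably optimizes a model loss at test time.

```python
import json

# Hyperparameters copied from the row for PR 1372.
cfg = json.loads('{"phase1_iters":5,"phase1_history":10,"phase2_iters":18,'
                 '"phase2_history":20,"history_reset_between_phases":true}')

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lbfgs_phase(f, grad, x, iters, history, memory=None):
    """Run `iters` L-BFGS steps, keeping at most `history` (s, y) pairs."""
    memory = [] if memory is None else list(memory)
    g = grad(x)
    for _ in range(iters):
        # Two-loop recursion: approximate the inverse-Hessian times gradient.
        q, alphas = list(g), []
        for s, y in reversed(memory):            # newest pair first
            a = dot(s, q) / dot(y, s)
            alphas.append(a)
            q = [qi - a * yi for qi, yi in zip(q, y)]
        if memory:                               # initial Hessian scaling
            s, y = memory[-1]
            gamma = dot(s, y) / dot(y, y)
            q = [gamma * qi for qi in q]
        for (s, y), a in zip(memory, reversed(alphas)):  # oldest pair first
            b = dot(y, q) / dot(y, s)
            q = [qi + (a - b) * si for qi, si in zip(q, s)]
        d = [-qi for qi in q]                    # descent direction
        # Backtracking line search with the Armijo condition.
        fx, slope, t = f(x), dot(g, d), 1.0
        while (f([xi + t * di for xi, di in zip(x, d)]) > fx + 1e-4 * t * slope
               and t > 1e-10):
            t *= 0.5
        x_new = [xi + t * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(s, y) > 1e-12:                    # keep only positive-curvature pairs
            memory = (memory + [(s, y)])[-history:]
        x, g = x_new, g_new
    return x, memory

def cascaded_lbfgs(f, grad, x0, cfg):
    """Phase 1 (short, small history) followed by phase 2 (longer, larger history)."""
    x, mem = lbfgs_phase(f, grad, x0, cfg["phase1_iters"], cfg["phase1_history"])
    if cfg["history_reset_between_phases"]:
        mem = None                               # assumed meaning: drop curvature pairs
    x, _ = lbfgs_phase(f, grad, x, cfg["phase2_iters"], cfg["phase2_history"], mem)
    return x

# Hypothetical toy objective: an ill-conditioned quadratic with minimum at (1, -2, 3).
def f(x):
    return 0.5 * (x[0] - 1) ** 2 + 5 * (x[1] + 2) ** 2 + 50 * (x[2] - 3) ** 2

def grad(x):
    return [x[0] - 1, 10 * (x[1] + 2), 100 * (x[2] - 3)]

x_star = cascaded_lbfgs(f, grad, [0.0, 0.0, 0.0], cfg)
print(x_star, f(x_star))
```

Under this reading, the total budget is 5 + 18 = 23 optimizer iterations. Resetting the history means phase 2 begins with a plain gradient step and rebuilds its curvature estimate from scratch, rather than reusing pairs collected at the phase 1 starting region.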