PR #737

open

[Non Record] Online Curriculum Learning

by SPTholeView on GitHub

val_bpb

1.3557

Architecture

—

Optimizer

—

Artifact Size

15.25MB

Training Techniques

Other

other

Online sequence-level curriculum learning: each sequence is scored by its unigram entropy, and within each batch the sequences to train on are selected according to a V-shaped difficulty schedule over training progress.

parameters: {"difficulty_metric":"unigram entropy","selection":"load 2x sequences per batch and select the half centered around target difficulty percentile","schedule_shape":"V-shaped","aligned_with":["LR warmdown","SWA phases"]}
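The selection rule in these parameters can be illustrated with a minimal sketch. It is not the PR's code: the function names are hypothetical, and it assumes that "unigram entropy" means the Shannon entropy of each sequence's own empirical token distribution (the PR might instead score against corpus-level unigram frequencies).

```python
import math
from collections import Counter

def unigram_entropy(tokens):
    """Shannon entropy in bits of the sequence's own token distribution (assumed metric)."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def select_half(sequences, target_percentile):
    """From a 2x oversampled batch, keep the half whose entropy ranks
    are centered on target_percentile (0.0 = easiest, 1.0 = hardest)."""
    keep = len(sequences) // 2
    # Candidate indices sorted from lowest (easy) to highest (hard) entropy.
    order = sorted(range(len(sequences)), key=lambda i: unigram_entropy(sequences[i]))
    # Center a window of `keep` ranks on the target rank, clamped to stay in range.
    center = target_percentile * (len(order) - 1)
    start = int(round(center - (keep - 1) / 2))
    start = max(0, min(start, len(order) - keep))
    return [sequences[i] for i in order[start:start + keep]]

# Four candidate sequences for a batch of two.
candidates = [[1, 1, 1, 1], [1, 1, 2, 2], [1, 2, 3, 4], [1, 1, 1, 2]]
easy_batch = select_half(candidates, target_percentile=0.0)
hard_batch = select_half(candidates, target_percentile=1.0)
```

Because the ranking is computed inside each oversampled batch, no global pass over the dataset is needed, but every step pays for loading and scoring twice as many sequences as it trains on.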

LR Schedule

warmdown

parameters: {"warmdown_fraction":0.45,"total_steps_symbol":"T"}
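For orientation, a warmdown schedule with these parameters might look like the sketch below. It assumes the learning rate stays at its base value and then decays linearly to zero over the last 45% of the T steps; the decay shape and the function name `lr_scale` are assumptions, not taken from the PR.

```python
def lr_scale(step, total_steps, warmdown_fraction=0.45):
    """Multiplier on the base learning rate: 1.0 until warmdown begins,
    then a linear decay to 0.0 at total_steps (assumed decay shape)."""
    warmdown_steps = round(total_steps * warmdown_fraction)
    if warmdown_steps == 0:
        return 1.0
    warmdown_start = total_steps - warmdown_steps
    if step < warmdown_start:
        return 1.0
    return max(0.0, (total_steps - step) / warmdown_steps)

# With T = 1000, warmdown covers steps 550 to 1000.
scales = [lr_scale(s, 1000) for s in (0, 550, 775, 1000)]
```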

Weight Averaging

SWA

parameters: {"start_frac":0.2,"every_steps":50}
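A rough sketch of how these two parameters could drive weight averaging follows. The meaning of `start_frac` is not spelled out here; the sketch assumes averaging begins once 20% of the total steps have elapsed, although it could equally denote a learning-rate threshold. The names `should_average` and `RunningAverage` are hypothetical, and plain floats stand in for model tensors.

```python
def should_average(step, total_steps, start_frac=0.2, every_steps=50):
    """True on steps where a snapshot is folded into the average
    (assumes start_frac is a fraction of total training steps)."""
    return step >= start_frac * total_steps and step % every_steps == 0

class RunningAverage:
    """Equal-weight running mean over parameter snapshots."""

    def __init__(self):
        self.count = 0
        self.mean = None

    def update(self, params):
        self.count += 1
        if self.mean is None:
            self.mean = dict(params)  # first snapshot initializes the mean
        else:
            for name, value in params.items():
                # Incremental mean: m += (x - m) / n
                self.mean[name] += (value - self.mean[name]) / self.count

avg = RunningAverage()
avg.update({"w": 1.0})
avg.update({"w": 3.0})
```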

Compression

zstd

level: null

Novel Contributions

Online sequence-level curriculum learning based on unigram entropy
V-shaped difficulty schedule that shifts from easy to hard and back to easy during training
Batch-local selection by oversampling 2x and choosing sequences around a target entropy percentile
Curriculum aligned with LR warmdown and SWA phases
Observation that runtime curriculum filtering adds overhead and can hurt overall performance at this scale
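
The easy-to-hard-to-easy schedule listed above could be sketched as a piecewise-linear function of training progress. The endpoint percentiles (0.25 and 0.75) and the turning point (0.5) are illustrative assumptions; the PR states only the shape and that it is aligned with the LR warmdown and SWA phases, so the actual turning point may sit elsewhere.

```python
def target_percentile(progress, easy=0.25, hard=0.75, turn=0.5):
    """Target difficulty percentile for training progress in [0, 1].
    Rises linearly from `easy` to `hard` at `turn`, then falls back to `easy`.
    All default values are illustrative; `turn` must lie strictly between 0 and 1."""
    progress = min(max(progress, 0.0), 1.0)
    if progress <= turn:
        frac = progress / turn
    else:
        frac = (1.0 - progress) / (1.0 - turn)
    return easy + (hard - easy) * frac

# Sampled at the start, quarter point, midpoint, and end of training.
targets = [target_percentile(p) for p in (0.0, 0.25, 0.5, 1.0)]
```

The returned value would be passed to whatever batch-local selector picks sequences around a target entropy percentile.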