PR #737

open

[Non Record] Online Curriculum Learning

by SPTholeView on GitHub
val_bpb
1.3557
Architecture
Optimizer
Artifact Size
15.25MB

Training Techniques

Other
other
Online sequence-level curriculum learning: each sequence is scored by its unigram entropy, and sequences are selected within each batch according to a V-shaped difficulty schedule over training progress.
parameters: {"difficulty_metric":"unigram entropy","selection":"load 2x sequences per batch and select the half centered around target difficulty percentile","schedule_shape":"V-shaped","aligned_with":["LR warmdown","SWA phases"]}
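The scoring and selection step described above can be sketched as follows. This is a minimal illustration, not the PR's code. It assumes "unigram entropy" means the Shannon entropy of a sequence's own empirical token distribution (the PR may instead score tokens under a corpus-level unigram model), and the function names `unigram_entropy` and `select_around_percentile` are hypothetical.

```python
import math
from collections import Counter


def unigram_entropy(tokens):
    """Shannon entropy in bits of the empirical token distribution of one sequence.

    Assumption: difficulty is measured per sequence from its own token counts.
    """
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_around_percentile(sequences, target_pct):
    """Keep half of the candidate sequences, centred on a target entropy percentile.

    `sequences` is the oversampled (2x) candidate batch; `target_pct` is in [0, 1],
    where 0 is the lowest-entropy candidate and 1 the highest.
    """
    n = len(sequences)
    keep = n // 2
    # Rank candidates from lowest to highest entropy.
    order = sorted(range(n), key=lambda i: unigram_entropy(sequences[i]))
    # Centre a window of `keep` ranks on the target, clamped to the valid range.
    center = round(target_pct * (n - 1))
    start = min(max(center - keep // 2, 0), n - keep)
    chosen = order[start:start + keep]
    # Return the survivors in their original batch order.
    return [sequences[i] for i in sorted(chosen)]
```

With a candidate batch of four sequences, `select_around_percentile(batch, 0.0)` keeps the two lowest-entropy sequences and `select_around_percentile(batch, 1.0)` keeps the two highest. Sorting a batch by entropy on every step is the kind of runtime cost the PR reports as overhead.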
LR Schedule
warmdown
parameters: {"warmdown_fraction":0.45,"total_steps_symbol":"T"}
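One plausible reading of these parameters is a learning-rate multiplier that stays at 1 and then decays over the final 45% of the T training steps. The sketch below assumes a linear decay to zero and omits any warmup; both are assumptions, and `lr_multiplier` is a hypothetical name.

```python
def lr_multiplier(step, total_steps, warmdown_fraction=0.45):
    """Multiplier applied to the base learning rate at a given step.

    Assumption: constant at 1.0, then linear decay to 0.0 over the last
    `warmdown_fraction` of training.
    """
    warmdown_steps = warmdown_fraction * total_steps
    warmdown_start = total_steps - warmdown_steps
    if step < warmdown_start:
        return 1.0
    # Fraction of the warmdown window still remaining, clamped at zero.
    return max(0.0, (total_steps - step) / warmdown_steps)
```

For T = 1000, the multiplier is 1.0 up to step 550, about 0.5 at step 775, and 0.0 at step 1000.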
Weight Averaging
SWA
parameters: {"start_frac":0.2,"every_steps":50}
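A rough sketch of how such an averaging schedule might work is shown below. It assumes `start_frac` is a fraction of total training steps, which is only one possible reading (it could equally be a threshold on the learning-rate scale), and it represents weights as a flat list of floats for simplicity. `RunningAverage` and `should_snapshot` are hypothetical names.

```python
class RunningAverage:
    """Incremental equal-weight mean of weight snapshots."""

    def __init__(self):
        self.count = 0
        self.mean = None

    def update(self, weights):
        """Fold one snapshot (a flat list of floats) into the running mean."""
        self.count += 1
        if self.mean is None:
            self.mean = list(weights)
        else:
            self.mean = [m + (w - m) / self.count
                         for m, w in zip(self.mean, weights)]


def should_snapshot(step, total_steps, start_frac=0.2, every_steps=50):
    """Assumption: average every `every_steps` steps once `start_frac` of training has passed."""
    return step >= start_frac * total_steps and step % every_steps == 0
```

In a training loop, `update` would be called with the current weights whenever `should_snapshot` returns true, and `mean` would be used for evaluation at the end.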
Compression
zstd
level: null

Novel Contributions

  • Online sequence-level curriculum learning based on unigram entropy
  • V-shaped difficulty schedule that shifts from easy to hard and back to easy during training
  • Batch-local selection by oversampling 2x and choosing sequences around a target entropy percentile
  • Curriculum aligned with LR warmdown and SWA phases
  • Observation that runtime curriculum filtering adds overhead and can hurt overall performance at this scale
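
The easy-to-hard-to-easy schedule listed above can be illustrated as a piecewise-linear map from training progress to a target entropy percentile. The endpoint percentiles (0.25 and 0.75) and the turning point (0.5) are placeholder values chosen for illustration; the PR does not state them here, and `target_percentile` is a hypothetical name.

```python
def target_percentile(progress, easy=0.25, hard=0.75, turn=0.5):
    """Target difficulty percentile for training progress in [0, 1].

    Assumption: difficulty rises linearly from `easy` to `hard` until `turn`,
    then falls linearly back to `easy` by the end of training.
    """
    progress = min(max(progress, 0.0), 1.0)
    if progress <= turn:
        frac = progress / turn
    else:
        frac = (1.0 - progress) / (1.0 - turn)
    return easy + (hard - easy) * frac
```

Aligning the curriculum with the LR warmdown would amount to setting `turn` near the step where the warmdown begins, so that the return to easier sequences coincides with the decaying learning rate.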