PR #283
Tier 6: PPM-C eval-time context mixer (standalone + neural mixing)
by Cwarren15-A
val_bpb
1.2244
Architecture
—
Optimizer
—
Artifact Size
—
Training Techniques
Evaluation
eval-time probability blending / context mixing
parameters: {"standalone_ppm_order":2,"fixed_alpha_neural_share":0.95,"fixed_alpha_ppm_share":0.05,"cumulative_alpha_neural_share":0.85,"cumulative_alpha_ppm_share":0.15}
Other
other
Standalone classical PPM-C order-2 context mixer used at evaluation time to estimate token probabilities.
parameters: {"order":2,"zero_learned_parameters":true,"zero_artifact_size_cost":true}
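The standalone order-2 estimator could look roughly like the following. This is a hypothetical sketch, not the PR's code: it uses PPM method-C escape weighting (escape mass = distinct symbols / (total count + distinct symbols)) but interpolates across orders rather than applying full PPM exclusions, so every number below is illustrative.

```python
from collections import Counter, defaultdict

class PPMC:
    """Sketch of an order-N PPM-C token-probability estimator.

    Hypothetical reconstruction: interpolated PPM-C escape weighting
    (no exclusion step), backing off from order N down to a uniform
    order -1 distribution over the vocabulary.
    """

    def __init__(self, order=2, vocab_size=256):
        self.order = order
        self.vocab_size = vocab_size
        # One frequency table per order: context tuple -> symbol counts.
        self.tables = [defaultdict(Counter) for _ in range(order + 1)]

    def prob(self, history, symbol):
        p = 0.0
        w = 1.0  # probability mass escaped down to lower orders
        for k in range(self.order, -1, -1):
            ctx = tuple(history[-k:]) if k else ()
            counts = self.tables[k].get(ctx)
            if not counts:
                continue  # no statistics at this order: pass all mass down
            total = sum(counts.values())
            distinct = len(counts)
            denom = total + distinct  # method C: escape count = distinct
            p += w * counts.get(symbol, 0) / denom
            w *= distinct / denom
        return p + w / self.vocab_size  # residual mass: uniform fallback

    def update(self, history, symbol):
        for k in range(self.order + 1):
            ctx = tuple(history[-k:]) if k else ()
            self.tables[k][ctx][symbol] += 1
```

Because the estimator only counts symbol occurrences, it has zero learned parameters and adds no artifact size, matching the entry above.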
other
Neural model probabilities blended with PPM probabilities using a fixed-alpha mixture.
parameters: {"alpha":0.95,"mode":"per-doc"}
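With a fixed alpha the blend is just a per-token convex combination of the two models' true-token probabilities. A minimal sketch (the function name is an assumption; alpha = 0.95 is the neural share from the parameters above, and BPB is taken as the mean negative log2 probability per byte-level token):

```python
import numpy as np

def blended_bpb(p_neural, p_ppm, alpha=0.95):
    """Blend per-token true-token probabilities from the neural model and
    the PPM mixer, then report mean bits per byte-level token.

    Hypothetical helper: alpha is the neural share (0.95 in this PR),
    1 - alpha the PPM share.
    """
    p = alpha * np.asarray(p_neural) + (1.0 - alpha) * np.asarray(p_ppm)
    return float(np.mean(-np.log2(p)))
```

The gain comes from tokens where the neural model is badly miscalibrated but the PPM context model is not: even a 5% PPM share lifts the blended probability well above the neural model's near-zero estimate on those tokens.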
other
Confidence-gated adaptive blending variant explored for per-token mixture weighting.
parameters: null
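Since the parameters for this variant are reported as null, the gate below is purely illustrative: it trusts the neural model more on tokens where the neural model itself is confident, and shifts weight toward PPM elsewhere. Every threshold and alpha value is an assumption.

```python
def gated_alpha(p_neural_token, alpha_hi=0.98, alpha_lo=0.80, threshold=0.5):
    """Per-token neural share, gated on neural-model confidence.

    Illustrative only: the PR does not report the gate's parameters.
    """
    return alpha_hi if p_neural_token >= threshold else alpha_lo

def blend_token(p_neural_token, p_ppm_token):
    # Convex combination with a per-token, confidence-dependent alpha.
    a = gated_alpha(p_neural_token)
    return a * p_neural_token + (1.0 - a) * p_ppm_token
```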
Novel Contributions
- Classical PPM-C context mixer for eval-time probability blending with the neural model
- Standalone PPM-C order-2 evaluator
- Fixed-alpha neural/PPM mixture that improves BPB by about 0.015
- Confidence-gated per-token adaptive blending variant
- Zero learned parameters and no artifact size cost