PR #782 (open)

Podracing III: Cubric Lite — 0.9362 BPB

by newjordan
val_bpb: 0.9362
Architecture: 11L/512d U-Net
Optimizer:
Artifact Size: 15.59 MB

Training Techniques

Quantization
int6
bits: 6
scope: model weights
GPTQ
bits: null
scope: training-phase calibration
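The int6 entry above (6 bits over the model weights) can be illustrated with a minimal round-trip sketch. The PR does not state the quantization scheme, so the symmetric, per-tensor, round-to-nearest form below is an assumption:

```python
import numpy as np

def quantize_int6(w):
    # Symmetric round-to-nearest 6-bit quantization of a weight tensor.
    # Signed int6 spans [-32, 31]; we clip to the symmetric part [-31, 31].
    qmax = 2 ** (6 - 1) - 1                      # 31
    scale = float(np.max(np.abs(w))) / qmax
    scale = scale if scale > 0 else 1.0          # guard all-zero tensors
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((512, 512)).astype(np.float32)
q, s = quantize_int6(w)
w_hat = dequantize(q, s)
print(int(q.min()), int(q.max()))                # stays within the int6 range
```

Per-tensor scaling keeps the artifact small (one scale per tensor); the reported 15.59 MB artifact is consistent with packing 6-bit codes rather than storing int8 directly, but the packing format is not described in the PR.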
Architecture
U-Net
11-layer, 512-dimensional U-Net architecture used as the base model.
parameters: {"layers":11,"dimensions":512}
Evaluation
score-first legal 7-gram backoff
parameters: {"orders":[2,3,4,5,6,7]}
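A score-first legal backoff over orders 2..7 can be sketched as follows. The table layout, scores, and alpha values are illustrative; the PR names the technique but not its data structures:

```python
# Toy n-gram tables: order -> {context tuple -> {token: score}}.
tables = {
    2: {("the",): {"cat": 0.5, "dog": 0.5}},
    3: {("on", "the"): {"mat": 0.9}},
}
alphas = {2: 0.3, 3: 0.3}   # per-order multipliers, as in Cubric Lite

def score_backoff(context, token, tables, alphas):
    # Score-first legal backoff: use the highest order whose context is
    # present ("legal") and contains the token; otherwise back off to
    # the next lower order, returning 0.0 if no order matches.
    for order in sorted(tables, reverse=True):
        ctx = tuple(context[-(order - 1):])
        dist = tables[order].get(ctx)
        if dist is not None and token in dist:
            return alphas[order] * dist[token]
    return 0.0

print(score_backoff(["on", "the"], "mat", tables, alphas))  # trigram hit
print(score_backoff(["on", "the"], "cat", tables, alphas))  # backs off to bigram
```

"Score-first" here is read as: take the first legal score found at the highest order, rather than interpolating across all orders; that reading is an assumption.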
Other
other
Entropy-adaptive alpha during n-gram evaluation.
parameters: null
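The entropy-adaptive alpha entry above gives no formula (parameters: null). One plausible reading, sketched here as an assumption, is to scale the n-gram mixing weight by the base model's normalized predictive entropy, so the n-gram score matters more when the model is uncertain:

```python
import math

def entropy_adaptive_alpha(probs, alpha_max=1.0):
    # Map normalized entropy of the base model's next-token distribution
    # to [0, alpha_max]. The linear mapping is an assumption; the PR
    # names the idea ("entropy-adaptive alpha") without details.
    h = -sum(p * math.log2(p) for p in probs if p > 0.0)
    h_max = math.log2(len(probs))
    return alpha_max * (h / h_max) if h_max > 0 else 0.0

print(entropy_adaptive_alpha([0.25] * 4))       # maximal entropy -> full alpha
print(entropy_adaptive_alpha([1.0, 0.0, 0.0]))  # confident model -> alpha 0
```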
other
Per-order adaptive alpha scaling ('Cubric Lite') that adjusts n-gram order multipliers based on beat-rate statistics from already-scored tokens.
parameters: {"update_interval_batches":32,"converged_multipliers":{"o2":0.3,"o3":0.3,"o4":0.97,"o5":2,"o6":2,"o7":2}}
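The Cubric Lite entry reports an update interval of 32 batches and converged multipliers clamped to [0.3, 2]. The multiplicative beat-rate update below is an assumed reconstruction, not the author's exact rule: orders whose n-gram scores beat the baseline on more than half of already-scored tokens get boosted, the rest get suppressed:

```python
class CubricLite:
    """Per-order adaptive alpha scaling sketch (hypothetical update rule)."""

    def __init__(self, orders=(2, 3, 4, 5, 6, 7), interval=32,
                 lr=0.5, lo=0.3, hi=2.0):
        self.mult = {o: 1.0 for o in orders}     # per-order multipliers
        self.beats = {o: 0 for o in orders}
        self.total = {o: 0 for o in orders}
        self.interval, self.lr, self.lo, self.hi = interval, lr, lo, hi
        self.batches = 0

    def observe(self, order, beat):
        # Record whether this order's score beat the baseline on an
        # already-scored token.
        self.total[order] += 1
        self.beats[order] += int(beat)

    def end_batch(self):
        self.batches += 1
        if self.batches % self.interval:
            return
        for o in self.mult:
            if self.total[o] == 0:
                continue
            rate = self.beats[o] / self.total[o]
            # Boost orders beating the baseline >50% of the time,
            # suppress the rest, clamped to [lo, hi].
            self.mult[o] *= 1.0 + self.lr * (2.0 * rate - 1.0)
            self.mult[o] = min(self.hi, max(self.lo, self.mult[o]))
            self.beats[o] = self.total[o] = 0

# Demo: 7-grams beat the baseline often, bigrams never (interval=1 for brevity).
cl = CubricLite(interval=1, lr=0.5)
for _ in range(20):
    for _ in range(10):
        cl.observe(2, False)
        cl.observe(7, True)
    cl.end_batch()
print(cl.mult[2], cl.mult[7])   # multipliers converge to the clamp bounds
```

Note that the reported converged multipliers (o2/o3 at 0.3, o5–o7 at 2) sit exactly at such clamp bounds, which is what motivates the [0.3, 2] range in this sketch.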
Compression
zstd
level: null

Novel Contributions

  • Per-order adaptive alpha scaling ('Cubric Lite') for n-gram evaluation
  • Suppressing low-order n-grams (bigrams/trigrams) while boosting higher-order n-grams based on beat-rate statistics
  • Entropy-adaptive alpha combined with score-first legal n-gram backoff
  • GPTQ calibration performed during training phase using training data only