PR #1083

open

Bandit: ClownCar Crawler x Cubric Ngram9 — 0.4961 BPB, 9.9mb

by newjordan
val_bpb
0.4961
Architecture
Hybrid
Optimizer
Artifact Size
9.21 MB

Training Techniques

Architecture
depth recurrence
ClownCar crawler with 4 flat layers plus 1 crawler repeated for 4 loops.
parameters: {"flat_layers":4,"crawler_layers":1,"loops":4}
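
The parameters above imply 5 unique blocks applied 8 times per forward pass. A minimal sketch of that schedule follows. It assumes the crawler block runs after the flat layers, which the PR summary does not state, and the names `layer_schedule`, `flat0`, and `crawler0` are hypothetical.

```python
def layer_schedule(flat_layers: int = 4, crawler_layers: int = 1, loops: int = 4) -> list[str]:
    """Return the order in which blocks are applied in one forward pass.

    Assumption: flat layers run first, then the crawler block(s) are
    re-applied `loops` times with shared weights.
    """
    schedule = [f"flat{i}" for i in range(flat_layers)]
    for _ in range(loops):
        schedule += [f"crawler{j}" for j in range(crawler_layers)]
    return schedule


schedule = layer_schedule()
unique_blocks = len(set(schedule))   # weights are stored once per unique block
effective_depth = len(schedule)      # block applications per forward pass
```

Under these assumptions the stored parameter count scales with `unique_blocks`, while compute scales with `effective_depth`, which is the usual motivation for depth recurrence under an artifact-size limit.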
other
Inst-dim 32 FLOW variant with DN=0 and causality-fixed configuration.
parameters: {"inst_dim":32,"flow":true,"dn":0,"causality_fixed":true}
Weight Averaging
EMA
parameters: {"start_step":4400,"decay":0.99}
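
As an illustration of how `start_step` and `decay` might interact, here is a small sketch of EMA weight averaging over plain Python lists. It assumes the average is initialised from the live weights at `start_step` and updated every step afterwards; the PR summary does not confirm either detail, and `ema_update` and `average_weights` are hypothetical names.

```python
def ema_update(ema, weights, decay=0.99):
    """One EMA step: ema <- decay * ema + (1 - decay) * weights."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema, weights)]


def average_weights(weight_stream, start_step=4400, decay=0.99):
    """Consume per-step weight snapshots and return the EMA weights.

    Assumption: steps before `start_step` are ignored and the EMA is
    seeded with the snapshot taken at `start_step`.
    """
    ema = None
    for step, weights in enumerate(weight_stream):
        if step < start_step:
            continue
        ema = list(weights) if ema is None else ema_update(ema, weights, decay)
    return ema
```

With `decay = 0.99` the average has an effective horizon of roughly 100 steps, so it mostly reflects the last stretch of training.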
Quantization
GPTQ
bits: 6
scope: model
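
To make the 6-bit budget concrete, the sketch below shows plain symmetric round-to-nearest quantization of one weight row to signed 6-bit integers. This is deliberately not GPTQ: GPTQ additionally uses second-order information to compensate the rounding error of each weight in the remaining weights. The per-row scale and the function names are assumptions for illustration only.

```python
def quantize_int6(row):
    """Symmetric round-to-nearest quantization to signed 6-bit integers.

    Simplified stand-in, not GPTQ. Values land in [-32, 31]; the scale
    maps the largest magnitude in the row to 31.
    """
    max_abs = max(abs(x) for x in row)
    scale = max_abs / 31 if max_abs > 0 else 1.0
    q = [max(-32, min(31, round(x / scale))) for x in row]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate float weights."""
    return [v * scale for v in q]


row = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int6(row)
restored = dequantize(codes, scale)
```

Each code needs 6 bits instead of 16 or 32, and the reconstruction error per weight is at most half a quantization step.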
Compression
zstd
level: null
Evaluation
sliding window eval
parameters: null
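
Since no parameters are listed for the sliding window evaluation, the following is only a generic sketch of the technique: overlapping windows give each token extra left context, and every token is scored exactly once. The `context_len` and `stride` values are hypothetical, and `bits_per_byte` assumes the loss is a summed negative log-likelihood in nats.

```python
import math


def sliding_windows(n_tokens, context_len, stride):
    """Return (begin, end, score_from) spans covering n_tokens.

    The model sees tokens [begin, end) but only tokens in
    [score_from, end) contribute to the loss, so no token is counted
    twice. Requires 0 < stride <= context_len.
    """
    spans = []
    prev_end = 0
    for begin in range(0, n_tokens, stride):
        end = min(begin + context_len, n_tokens)
        spans.append((begin, end, prev_end))
        prev_end = end
        if end == n_tokens:
            break
    return spans


def bits_per_byte(total_nll_nats, n_bytes):
    """Convert a summed negative log-likelihood in nats to bits per byte."""
    return total_nll_nats / (math.log(2) * n_bytes)
```

A smaller stride gives each scored token more context at the cost of more forward passes.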
Other
other
N-gram oracle / X-WING ngram stack with shared tables, 3D Cubric 54-cell warm-start, entropy-adaptive alpha, complement alpha, and ngram evaluation order 9.
parameters: {"shared_tables":true,"cubric_warm_start_cells":54,"alpha_range":[0.2,0.75],"complement_alpha":0.5,"ngram_eval_order":9}
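
One plausible reading of "entropy-adaptive alpha" with `alpha_range` [0.2, 0.75] is sketched below: the n-gram distribution gets more weight when the base model is uncertain. The linear mapping from normalised entropy to alpha and its direction are assumptions, not taken from the PR, and the Cubric warm-start and `complement_alpha` are not modelled here. All function names are hypothetical.

```python
import math


def entropy_bits(p):
    """Shannon entropy of a probability vector, in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def adaptive_alpha(p_model, lo=0.2, hi=0.75):
    """Map normalised model entropy in [0, 1] linearly onto [lo, hi].

    Assumption: a confident model (low entropy) gets alpha near `lo`,
    an uncertain one gets alpha near `hi`.
    """
    max_h = math.log2(len(p_model))
    h = entropy_bits(p_model) / max_h if max_h > 0 else 0.0
    return lo + (hi - lo) * h


def blend(p_model, p_ngram):
    """Mix the two next-token distributions with the adaptive alpha."""
    a = adaptive_alpha(p_model)
    return [(1 - a) * m + a * g for m, g in zip(p_model, p_ngram)]
```

Because both inputs are probability vectors and the mix is convex, the blended output is again a valid distribution.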

Novel Contributions

  • ClownCar crawler base model with repeated crawler loops
  • Custom X-WING n-gram oracle stacked on top of the base model
  • 3D Cubric 54-cell warm-start for the n-gram component
  • Entropy-adaptive alpha blending for n-gram correction
  • GPTQ-int6 plus zstd artifact packing