val_bpb
0.4961
Architecture
Hybrid
Optimizer
—
Artifact Size
9.21 MB
Training Techniques
Architecture
depth recurrence
ClownCar crawler: 4 flat layers plus 1 crawler layer that is repeated for 4 loops.
parameters: {"flat_layers":4,"crawler_layers":1,"loops":4}
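A minimal sketch of what the depth-recurrence parameters above imply for the layer count. The layer names and the ordering (flat layers first, then the crawler loops) are assumptions for illustration; the entry only gives the three counts.

```python
# Hypothetical sketch: layer names and the flat-then-crawler ordering are assumptions.
def layer_schedule(flat_layers=4, crawler_layers=1, loops=4):
    """Return the order in which layers are applied in one forward pass."""
    schedule = [f"flat_{i}" for i in range(flat_layers)]
    for _ in range(loops):
        # The same crawler weights are reused on every loop.
        schedule += [f"crawler_{j}" for j in range(crawler_layers)]
    return schedule

schedule = layer_schedule()
unique_layers = len(set(schedule))  # layers that hold their own parameters
effective_depth = len(schedule)     # layer applications per forward pass
```

Under these assumptions the model stores 5 distinct layers but applies 8 per forward pass, which is the usual appeal of depth recurrence when artifact size is constrained.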
other
Inst-dim 32 FLOW variant with DN=0 and causality-fixed configuration.
parameters: {"inst_dim":32,"flow":true,"dn":0,"causality_fixed":true}
Weight Averaging
EMA
parameters: {"start_step":4400,"decay":0.99}
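The EMA parameters above can be read as the following update rule. This is a sketch, not the submission's code: the function name `ema_update` and the choice to seed the average with the raw weights at `start_step` are assumptions.

```python
# Hypothetical sketch of EMA weight averaging with a delayed start.
def ema_update(ema, weights, step, start_step=4400, decay=0.99):
    """Return the updated EMA dict, or None while averaging has not started."""
    if step < start_step:
        return None
    if ema is None:
        # Assumption: the average is seeded with the current weights.
        return dict(weights)
    return {k: decay * ema[k] + (1.0 - decay) * weights[k] for k in weights}

ema = None
for step, w in [(4399, 1.0), (4400, 1.0), (4401, 2.0)]:
    ema = ema_update(ema, {"w": w}, step)
```

With a decay of 0.99, each new step contributes 1% to the average, so the averaged weights reflect roughly the last hundred steps of training.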
Quantization
GPTQ
bits: 6
scope: model
Compression
zstd
level: null
Evaluation
sliding window eval
parameters: null
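Since the parameters are null, the window and stride used here are not recorded. The sketch below shows one common way sliding-window evaluation is set up, with hypothetical `window` and `stride` values: each window supplies context, and only the tokens not already scored by an earlier window count toward the loss.

```python
# Hypothetical sketch: window and stride values are illustrative, not from the entry.
def sliding_windows(n_tokens, window, stride):
    """Return (begin, end, score_from) spans; tokens in [score_from, end) are scored."""
    spans = []
    prev_end = 0
    for begin in range(0, n_tokens, stride):
        end = min(begin + window, n_tokens)
        spans.append((begin, end, prev_end))
        prev_end = end
        if end == n_tokens:
            break
    return spans

spans = sliding_windows(n_tokens=10, window=4, stride=2)
scored = sum(end - score_from for _, end, score_from in spans)
```

Every token is scored exactly once, and every window after the first scores its new tokens with `window - stride` tokens of preceding context.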
Other
other
N-gram oracle / X-WING n-gram stack with shared tables, a 3D Cubric 54-cell warm start, entropy-adaptive alpha, complement alpha, and n-gram evaluation order 9.
parameters: {"shared_tables":true,"cubric_warm_start_cells":54,"alpha_range":[0.2,0.75],"complement_alpha":0.5,"ngram_eval_order":9}
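One way to picture entropy-adaptive alpha is as a mixing weight that moves within `alpha_range` according to how uncertain the base model is. The linear mapping from normalized entropy to alpha below is an assumption, as are all function names; the entry gives only the range [0.2, 0.75]. The complement alpha and the Cubric warm start are not modeled here.

```python
import math

# Hypothetical sketch: the linear entropy-to-alpha mapping is an assumption.
def entropy_bits(probs):
    """Shannon entropy of a probability vector, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def adaptive_alpha(model_probs, alpha_min=0.2, alpha_max=0.75):
    """Scale alpha with the model's entropy, normalized to [0, 1]."""
    max_entropy = math.log2(len(model_probs))
    u = entropy_bits(model_probs) / max_entropy if max_entropy > 0 else 0.0
    return alpha_min + (alpha_max - alpha_min) * u

def blend(model_probs, ngram_probs):
    """Mix the two distributions; a higher alpha trusts the n-gram side more."""
    a = adaptive_alpha(model_probs)
    return [(1.0 - a) * pm + a * pn for pm, pn in zip(model_probs, ngram_probs)]

mixed = blend([0.25, 0.25, 0.25, 0.25], [0.7, 0.1, 0.1, 0.1])
```

In this sketch a confident base model keeps alpha near 0.2, while a maximally uncertain one pushes it to 0.75, and the blended output remains a valid probability distribution.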
Novel Contributions
- ClownCar crawler base model with repeated crawler loops
- Custom X-WING n-gram oracle stacked on top of the base model
- 3D Cubric 54-cell warm-start for the n-gram component
- Entropy-adaptive alpha blending for n-gram correction
- GPTQ-int6 plus zstd artifact packing