PR #990

open

ClownCar: Frugendorff compression baseline + canonical DeltaNet integration

by newjordan
val_bpb: 0.7614
Architecture: Transformer
Optimizer: (none listed)
Artifact Size: 9.06MB

Training Techniques

Architecture
depth recurrence
Uses one shared crawler block executed repeatedly across loops, with unique flat encoder/decoder layers around it.
parameters: {"loops":4,"flat_layers":4}
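As a rough illustration of the layout described above, the sketch below runs a few unique flat layers, then one shared block applied `loops` times, then more unique flat layers. It is a plain-Python toy, not the PR's code: the even encoder/decoder split of `flat_layers`, the `make_layer` helper, and the scalar affine "layers" are all assumptions made for illustration.

```python
def make_layer(scale, shift):
    """Hypothetical stand-in for a transformer block: an affine map on a list of floats."""
    def layer(x):
        return [scale * v + shift for v in x]
    return layer


class CrawlerStack:
    """Toy depth-recurrent stack: flat encoder -> shared block x loops -> flat decoder."""

    def __init__(self, loops=4, flat_layers=4):
        half = flat_layers // 2  # assumption: flat layers are split evenly around the crawler
        self.encoder = [make_layer(1.0, 0.1 * (i + 1)) for i in range(half)]
        self.crawler = make_layer(0.5, 1.0)  # ONE block, reused on every loop
        self.decoder = [make_layer(1.0, -0.1 * (i + 1)) for i in range(flat_layers - half)]
        self.loops = loops

    def forward(self, x):
        for layer in self.encoder:
            x = layer(x)
        for _ in range(self.loops):
            x = self.crawler(x)  # same weights on each pass
        for layer in self.decoder:
            x = layer(x)
        return x

    def unique_blocks(self):
        # Number of distinct blocks whose weights must be stored.
        return len(self.encoder) + 1 + len(self.decoder)

    def effective_depth(self):
        # Number of block applications in one forward pass.
        return len(self.encoder) + self.loops + len(self.decoder)


stack = CrawlerStack(loops=4, flat_layers=4)
out = stack.forward([0.0, 1.0])
```

Under these assumptions the toy stores 5 distinct blocks but applies 8 per forward pass, which is the source of the parameter savings that depth recurrence is meant to provide.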
DeltaNet
Canonical DeltaNet integration via CanonicalDeltaNet and chunk_delta_rule.
parameters: {"heads":4}
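For readers unfamiliar with the delta rule, here is a minimal single-head sketch of the token-by-token recurrence usually associated with DeltaNet: the state `S` is corrected toward the new value along the direction of the current key. This is a naive reference in plain Python, not the chunked computation that `chunk_delta_rule` refers to, and the function name `delta_rule_recurrent`, the unit-norm keys, and the write strength `beta` are assumptions for illustration.

```python
def delta_rule_recurrent(qs, ks, vs, betas):
    """Naive per-token delta rule for one head.

    The state S is a d_v x d_k matrix (a list of rows). For each token:
        S <- S + beta * (v - S k) k^T
        o  = S q
    """
    d_k, d_v = len(ks[0]), len(vs[0])
    S = [[0.0] * d_k for _ in range(d_v)]
    outputs = []
    for q, k, v, beta in zip(qs, ks, vs, betas):
        # Value the current state already predicts for this key: S k
        pred = [sum(S[i][j] * k[j] for j in range(d_k)) for i in range(d_v)]
        # Rank-1 correction toward the new value
        for i in range(d_v):
            err = beta * (v[i] - pred[i])
            for j in range(d_k):
                S[i][j] += err * k[j]
        # Read out with the query: S q
        outputs.append([sum(S[i][j] * q[j] for j in range(d_k)) for i in range(d_v)])
    return outputs


# Write value [1, 2] under key [1, 0], then overwrite it with [5, 6].
k = [1.0, 0.0]
outs = delta_rule_recurrent(
    qs=[k, k], ks=[k, k], vs=[[1.0, 2.0], [5.0, 6.0]], betas=[1.0, 1.0]
)
```

With `beta = 1` and a unit-norm key, the second write fully replaces the first, so querying with the same key returns the latest value rather than a sum of both. That overwrite behaviour is what distinguishes the delta rule from purely additive linear attention.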
weight tying
Shares crawler block weights across repeated loop executions.
parameters: null
Quantization
int6
bits: 6
scope: all
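To make the int6 entry concrete, the following sketch quantizes a list of weights to signed 6-bit codes and back. The card does not say which scheme the submission uses, so the symmetric per-tensor scaling, the [-31, 31] code range, and the function names here are assumptions.

```python
def quantize_int6(weights):
    """Symmetric per-tensor quantization to signed 6-bit integer codes in [-31, 31]."""
    qmax = 2 ** (6 - 1) - 1  # 31
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


weights = [0.62, -0.31, 0.05, -0.62, 0.0]
codes, scale = quantize_int6(weights)
restored = dequantize(codes, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

In this scheme the round-trip error per weight is bounded by half the step size `scale`, so tensors with a few large outliers lose more precision on their small weights.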
Compression
zstd
level: null
Evaluation
sliding window eval
parameters: null
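Since no parameters are given for the sliding window evaluation, the sketch below shows one common way to set it up: overlapping windows are fed to the model, but only the tokens not yet scored contribute to the loss, so each token is scored exactly once with as much left context as the window allows. The `window` and `stride` values, the function names, and the reading of `val_bpb` as validation bits per byte are assumptions.

```python
import math


def sliding_windows(n_tokens, window, stride):
    """Return (start, end, score_from) spans.

    The model sees tokens[start:end]; only tokens[score_from:end] are scored.
    """
    assert 1 <= stride <= window
    spans = []
    scored_until = 0
    while scored_until < n_tokens:
        # The first window is scored in full; later windows advance by `stride`.
        end = min(window if scored_until == 0 else scored_until + stride, n_tokens)
        start = max(0, end - window)
        spans.append((start, end, scored_until))
        scored_until = end
    return spans


def bits_per_byte(total_nll_nats, n_bytes):
    """Convert a summed negative log-likelihood in nats to bits per byte."""
    return total_nll_nats / (math.log(2) * n_bytes)


spans = sliding_windows(n_tokens=10, window=4, stride=2)
```

A smaller stride gives each scored token more context at the cost of more forward passes, which is the usual trade-off when tuning this kind of evaluation.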
LR Schedule
warmdown
parameters: {"warmdown_iters":2000}
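A warmdown schedule of this kind can be sketched as a multiplier on the base learning rate that stays at 1.0 and then decays over the final `warmdown_iters` steps. The linear decay shape, the decay to exactly zero, and the run length `TOTAL_ITERS` below are assumptions; only `warmdown_iters = 2000` comes from the card.

```python
def lr_scale(step, total_iters, warmdown_iters=2000):
    """Multiplier on the base LR: 1.0 until warmdown starts, then linear decay to 0."""
    warmdown_start = total_iters - warmdown_iters
    if step < warmdown_start:
        return 1.0
    return max(0.0, (total_iters - step) / warmdown_iters)


TOTAL_ITERS = 10_000  # hypothetical run length, not taken from the PR

schedule = [lr_scale(s, TOTAL_ITERS) for s in (0, 8_000, 9_000, 10_000)]
```

Here the multiplier is still 1.0 at step 8,000, falls to 0.5 at step 9,000, and reaches 0.0 at the final step.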
Other
other
Int8 crawler quantization mode used to improve quantization resilience.
parameters: {"env_var":"CRAWLER_QUANT_INT8=1"}
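One plausible way such a switch could be wired up is shown below: the environment variable selects a wider bit width for the shared crawler block only. How the submission actually parses `CRAWLER_QUANT_INT8`, and the fallback to 6 bits when it is unset, are assumptions; the helper name `crawler_quant_bits` is hypothetical.

```python
import os


def crawler_quant_bits(env=None):
    """Return the bit width to use for the shared crawler block.

    Assumption: the flag is enabled by the literal value "1" and the
    crawler otherwise follows the int6 path used for other weights.
    """
    env = os.environ if env is None else env
    int8_mode = env.get("CRAWLER_QUANT_INT8", "0") == "1"
    return 8 if int8_mode else 6


bits_on = crawler_quant_bits({"CRAWLER_QUANT_INT8": "1"})
bits_off = crawler_quant_bits({})
```

A possible motivation, offered here only as a guess, is that quantization error in a block reused on every loop is applied repeatedly, so spending extra bits on that one block is comparatively cheap.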

Novel Contributions

  • Frugendorff (F-Wing) crawler baseline with a shared recurrent block repeated across loops
  • Loop-specific instruction perturbations recomputed from the current hidden state each loop
  • Canonical DeltaNet integration using chunk_delta_rule
  • Empirical signal analysis separating width effects from weight sharing effects
  • Int8 crawler quantization mode to mitigate post-processing degradation
  • Confirmed sub-16MB submission with ~9.06MB artifact size