PR #990
openClownCar: Frugendorff compression baseline + canonical DeltaNet integration
by newjordanView on GitHub
val_bpb
0.7614
Architecture
Transformer
Optimizer
—
Artifact Size
9.06MB
Training Techniques
Architecture
depth recurrence
Uses one shared crawler block that is executed once per loop, with unique (non-shared) flat encoder and decoder layers around it.
parameters: {"loops":4,"flat_layers":4}
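A minimal sketch of the depth-recurrence layout described above, not the PR's code. The `Block` class is a hypothetical stand-in for a transformer block, and the even 2 + 2 split of the 4 flat layers between encoder and decoder is an assumption; the source only gives `loops` and `flat_layers`.

```python
from typing import List


class Block:
    """Hypothetical stand-in for a transformer block (a residual scaling)."""

    def __init__(self, name: str, scale: float) -> None:
        self.name = name
        self.scale = scale
        self.calls = 0  # how many times this block's weights were applied

    def __call__(self, x: List[float]) -> List[float]:
        self.calls += 1
        return [xi + self.scale * xi for xi in x]


def crawler_forward(x, encoder, crawler, decoder, loops=4):
    """Unique encoder layers, then one shared block looped, then unique decoder layers."""
    for layer in encoder:
        x = layer(x)
    for _ in range(loops):
        x = crawler(x)  # same weights on every loop
    for layer in decoder:
        x = layer(x)
    return x


# Assumed split: 2 flat encoder layers + 2 flat decoder layers = 4 flat layers.
encoder = [Block(f"enc{i}", 0.01) for i in range(2)]
decoder = [Block(f"dec{i}", 0.01) for i in range(2)]
crawler = Block("crawler", 0.02)

out = crawler_forward([1.0, -2.0], encoder, crawler, decoder, loops=4)
unique_blocks = len(encoder) + 1 + len(decoder)
effective_depth = sum(b.calls for b in encoder + [crawler] + decoder)
```

Under these assumptions the model stores 5 blocks' worth of weights but applies 8 block evaluations per forward pass, which is where the artifact-size saving would come from.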
DeltaNet
Canonical DeltaNet integration via CanonicalDeltaNet and chunk_delta_rule.
parameters: {"heads":4}
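For reference, the delta rule that DeltaNet layers are built on can be written as a plain token-by-token recurrence. The sketch below is a naive single-head reference in pure Python, not the PR's `CanonicalDeltaNet` and not the `chunk_delta_rule` kernel itself; the function name and argument layout are hypothetical. A chunked implementation would be expected to compute the same outputs more efficiently.

```python
def delta_rule_recurrent(q, k, v, beta):
    """Naive delta rule for one head.

    State S has shape (d_v, d_k). For each token t:
        S <- S + beta_t * (v_t - S k_t) k_t^T
        o_t = S q_t
    """
    d_k, d_v = len(k[0]), len(v[0])
    S = [[0.0] * d_k for _ in range(d_v)]
    outputs = []
    for q_t, k_t, v_t, b_t in zip(q, k, v, beta):
        # What the memory currently returns for this key.
        pred = [sum(S[i][j] * k_t[j] for j in range(d_k)) for i in range(d_v)]
        # Correct the memory toward the new value (the "delta").
        for i in range(d_v):
            err = v_t[i] - pred[i]
            for j in range(d_k):
                S[i][j] += b_t * err * k_t[j]
        outputs.append(
            [sum(S[i][j] * q_t[j] for j in range(d_k)) for i in range(d_v)]
        )
    return outputs


# Writing twice to the same unit-norm key with beta = 1 overwrites the value.
keys = [[1.0, 0.0], [1.0, 0.0]]
values = [[2.0, 3.0], [5.0, 7.0]]
outs = delta_rule_recurrent(q=keys, k=keys, v=values, beta=[1.0, 1.0])
```

With a unit-norm key and `beta = 1`, reading back with the same key returns the most recently written value, which is the overwrite behavior that distinguishes the delta rule from purely additive linear attention.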
weight tying
Shares crawler block weights across repeated loop executions.
parameters: null
Quantization
int6
bits: 6
scope: all
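A rough sketch of what 6-bit weight quantization can look like. The symmetric per-row scheme with integer range [-31, 31] is an assumption; the source only states 6 bits applied to all weights, and the function names are hypothetical.

```python
def quantize_int6(row):
    """Symmetric quantization of one weight row to integers in [-31, 31]."""
    qmax = 31  # assumed symmetric 6-bit range
    max_abs = max(abs(w) for w in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in row]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate float weights."""
    return [qi * scale for qi in q]


row = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int6(row)
restored = dequantize(codes, scale)
max_err = max(abs(a - b) for a, b in zip(row, restored))
```

The rounding error per weight is bounded by half a quantization step (`scale / 2`), so rows with a large outlier weight lose more precision on their small weights.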
Compression
zstd
level: null
Evaluation
sliding window eval
parameters: null
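Sliding window evaluation typically scores each token once while giving it as much left context as the window allows. The sketch below only builds the window plan; `ctx_len` and `stride` are hypothetical parameters, since the source gives none.

```python
def sliding_windows(n_tokens, ctx_len, stride):
    """Return (start, end, score_from) triples.

    The model would see tokens[start:end], and only tokens[score_from:end]
    would contribute to the loss, so no token is counted twice.
    """
    if not 0 < stride <= ctx_len:
        raise ValueError("stride must satisfy 0 < stride <= ctx_len")
    windows = []
    scored_upto = 0
    start = 0
    while scored_upto < n_tokens:
        end = min(start + ctx_len, n_tokens)
        windows.append((start, end, scored_upto))
        scored_upto = end
        start += stride
    return windows


plan = sliding_windows(n_tokens=10, ctx_len=4, stride=2)
scored = [t for _, end, score_from in plan for t in range(score_from, end)]
```

A smaller stride gives each scored token more context at the cost of more forward passes.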
LR Schedule
warmdown
parameters: {"warmdown_iters":2000}
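One common reading of a warmdown schedule is a constant learning rate followed by a decay over the final `warmdown_iters` steps. The sketch below assumes a linear decay to zero and a hypothetical `total_iters`; the source only gives `warmdown_iters = 2000`.

```python
def lr_scale(step, total_iters, warmdown_iters=2000):
    """Multiplier applied to the base learning rate at a given step."""
    warmdown_start = total_iters - warmdown_iters
    if step < warmdown_start:
        return 1.0  # constant phase
    # Assumed linear decay from 1.0 to 0.0 over the last warmdown_iters steps.
    return max(0.0, (total_iters - step) / warmdown_iters)


# Example with a hypothetical 10,000-step run.
schedule = [lr_scale(s, total_iters=10_000) for s in (0, 8_000, 9_000, 10_000)]
```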
Other
other
Optional int8 quantization mode for the crawler block, used to make the shared block more resilient to quantization error.
parameters: {"env_var":"CRAWLER_QUANT_INT8=1"}
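A small sketch of how such an environment flag might select the bit width per block. Only the variable name `CRAWLER_QUANT_INT8` comes from the source; the helper names and the rule "8 bits for the crawler, 6 bits elsewhere" are assumptions about how the flag is applied.

```python
import os


def crawler_uses_int8(env=None):
    """True when CRAWLER_QUANT_INT8=1 is set in the given environment mapping."""
    env = os.environ if env is None else env
    return env.get("CRAWLER_QUANT_INT8", "0") == "1"


def bits_for_block(is_crawler, env=None):
    """Assumed policy: int8 for the shared crawler block when the flag is on, else int6."""
    if is_crawler and crawler_uses_int8(env):
        return 8
    return 6


flag_on = {"CRAWLER_QUANT_INT8": "1"}
flag_off = {}
```

A plausible motivation is that the crawler block is applied on every loop, so its quantization error is incurred repeatedly, while the flat layers are applied only once each.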
Novel Contributions
- Frugendorff (F-Wing) crawler baseline with a shared recurrent block repeated across loops
- Loop-specific instruction perturbations recomputed from the current hidden state each loop
- Canonical DeltaNet integration using chunk_delta_rule
- Empirical signal analysis separating width effects from weight sharing effects
- Int8 crawler quantization mode to mitigate the degradation introduced by post-training quantization
- Confirmed sub-16MB submission with ~9.06MB artifact size