PR #2083

open

Record: SP8192 CaseOps v13 PPM tuned gate — fresh 3-seed mean 0.94175270

by NewyorkDevView on GitHub
val_bpb: 0.9418
Architecture: Transformer
Optimizer: (not listed)
Artifact Size: 15,988,348 bytes
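
As a quick sanity check on the size figure, the sketch below compares the reported artifact size with a 16 MB cap. Whether the cap counts decimal megabytes (16,000,000 bytes) or binary ones (16,777,216 bytes) is not stated here, so both readings are shown. The constant and function names are illustrative only.

```python
ARTIFACT_BYTES = 15_988_348  # artifact size reported for this PR

# Two possible readings of "16 MB"; which one the rules use is an assumption.
CAP_DECIMAL = 16 * 10**6   # 16,000,000 bytes
CAP_BINARY = 16 * 2**20    # 16,777,216 bytes


def headroom(size_bytes: int, cap_bytes: int) -> int:
    """Return the bytes left under the cap (negative if the cap is exceeded)."""
    return cap_bytes - size_bytes


for label, cap in (("decimal", CAP_DECIMAL), ("binary", CAP_BINARY)):
    print(f"{label}: {headroom(ARTIFACT_BYTES, cap):,} bytes of headroom")
```

The artifact fits under either reading, with far less slack under the decimal one.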

Training Techniques

Architecture
SmearGate
Attention/output gating with BOS cross-document leak masking, applied in both the normal forward path and the test-time training (TTT) forward path.
parameters: null
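
The summary does not spell out how the gate or the mask is implemented. The following is a minimal sketch of one possible reading, assuming the gate blends each position with its predecessor and that a BOS token marks the start of a new document. `BOS_ID`, `smear_with_bos_mask`, and the per-position `gate` values are hypothetical, not taken from the submission.

```python
from typing import List, Sequence

BOS_ID = 1  # hypothetical BOS token id, for illustration only


def smear_with_bos_mask(
    tokens: Sequence[int],
    x: Sequence[Sequence[float]],
    gate: Sequence[float],
    bos_id: int = BOS_ID,
) -> List[List[float]]:
    """Blend each position with its predecessor, except across a BOS boundary."""
    if not tokens:
        return []
    out = [list(x[0])]  # position 0 has no predecessor
    for t in range(1, len(tokens)):
        # A BOS token starts a new document, so its predecessor belongs to the
        # previous document and must not contribute (the "leak" being masked).
        g = 0.0 if tokens[t] == bos_id else gate[t]
        out.append([cur + g * prev for cur, prev in zip(x[t], x[t - 1])])
    return out
```

Routing both the normal and the TTT forward path through a single helper of this kind is one way to keep the mask from being skipped in either path.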
depth recurrence
SP8192 CaseOps transformer stack with recurrence lineage referenced in the submission.
parameters: {"sequence_length":8192}
Evaluation
sliding window eval
parameters: null
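
For readers unfamiliar with the technique, here is a generic sketch of sliding-window evaluation: each token is scored exactly once, with as much preceding context as the window allows. The stride, the helper names, and the reading of val_bpb as validation bits per byte are assumptions; the submission's actual settings are not given here.

```python
import math
from typing import Iterator, Tuple


def sliding_windows(n_tokens: int, window: int, stride: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (start, end, score_from) triples.

    Tokens in [score_from, end) are scored; tokens in [start, score_from)
    serve only as context.
    """
    if not 0 < stride <= window:
        raise ValueError("need 0 < stride <= window")
    scored = 0  # everything before this index has already been scored
    while scored < n_tokens:
        # The first window scores all of its tokens; later ones score `stride` new tokens.
        end = min(scored + stride, n_tokens) if scored else min(window, n_tokens)
        start = max(0, end - window)
        yield start, end, scored
        scored = end


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood in nats to bits per byte."""
    return total_nll_nats / (math.log(2) * total_bytes)
```

A smaller stride gives each scored token more context on average, at the cost of more forward passes.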
Test-Time Training
score-first TTT
parameters: {"enabled":false}
Compression
lrzip
level: null
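
The sketch below illustrates the general idea of packing 6-bit integers and compressing each tensor group as its own stream. It is not the submission's code. Python's standard `lzma` module stands in for lrzip, which is a separate command-line tool. The group names, the unsigned [0, 63] value range, and all function names are assumptions.

```python
import lzma
from typing import Dict, List


def pack_int6(values: List[int]) -> bytes:
    """Pack integers in [0, 63] into 6 bits each (4 values per 3 bytes)."""
    if any(not 0 <= v < 64 for v in values):
        raise ValueError("int6 values must be in [0, 63]")
    padded = values + [0] * (-len(values) % 4)  # pad to a multiple of 4
    out = bytearray()
    for i in range(0, len(padded), 4):
        a, b, c, d = padded[i:i + 4]
        word = (a << 18) | (b << 12) | (c << 6) | d  # one 24-bit word
        out += word.to_bytes(3, "big")
    return bytes(out)


def unpack_int6(data: bytes, count: int) -> List[int]:
    """Inverse of pack_int6; `count` drops the padding values."""
    values: List[int] = []
    for i in range(0, len(data), 3):
        word = int.from_bytes(data[i:i + 3], "big")
        values += [(word >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return values[:count]


def compress_groups(groups: Dict[str, List[int]]) -> Dict[str, bytes]:
    """Compress each group separately (lzma here, as a stand-in for lrzip)."""
    return {name: lzma.compress(pack_int6(vals)) for name, vals in groups.items()}
```

Compressing groups separately lets tensors with similar statistics share a stream, which can help the compressor. Whether it does for a given model has to be measured.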
Sequence Length
sequence_length
train_length: 8192
eval_length: 8192

Novel Contributions

  • SP8192 CaseOps consolidation with sidecar-aware byte PPM evaluation
  • SmearGate BOS leak fix applied in both normal and TTT forward paths
  • Per-group lrzip compression for banked int6 tensors
  • PPM order-5 gate retune to H=0.999, L=0.18, T=0.80
  • Fresh three-seed end-to-end reruns under the 16 MB artifact cap
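
The list above gives the retuned gate values but does not define H, L, or T. As an illustration only, the sketch below assumes one plausible reading: T is a confidence threshold on the PPM prediction, and H and L are the weights given to PPM above and below it. The function name and this interpretation are assumptions, not the submission's definition.

```python
from typing import Dict

H, L, T = 0.999, 0.18, 0.80  # values from the PR; their meaning here is assumed


def gated_mix(
    p_model: Dict[int, float],
    p_ppm: Dict[int, float],
    high: float = H,
    low: float = L,
    threshold: float = T,
) -> Dict[int, float]:
    """Mix two next-byte distributions with a confidence-gated weight.

    The PPM weight is `high` when PPM's top probability reaches `threshold`,
    and `low` otherwise.
    """
    w = high if max(p_ppm.values()) >= threshold else low
    symbols = set(p_model) | set(p_ppm)
    return {s: (1 - w) * p_model.get(s, 0.0) + w * p_ppm.get(s, 0.0) for s in symbols}
```

Under this reading, a confident PPM prediction almost fully overrides the model, while an uncertain one only nudges it.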