PR #1478

open

Shallow Blue: BOS-Reset Exact Memory Probe

val_bpb
1.1995
Architecture
Transformer
Optimizer
Artifact Size
15,933,037 bytes

Training Techniques

Architecture
weight tying
The baseline SP-1024 backbone uses tied embeddings (input embedding and output projection share weights).
parameters: null
Evaluation
BOS-reset non-overlap eval
parameters: {"window":1024,"stride":1024}
stride-based eval
parameters: {"stride":1024}
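
A minimal sketch of one plausible reading of the BOS-reset non-overlap regime, not the PR's actual evaluation code. It assumes "BOS-reset" means that context is cut at every BOS token so no window spans two documents, and that window=1024 with stride=1024 means each token is scored exactly once. The function name `bos_reset_windows` and the `bos_id` argument are hypothetical.

```python
from typing import Iterator, List


def bos_reset_windows(
    tokens: List[int], bos_id: int, window: int = 1024, stride: int = 1024
) -> Iterator[List[int]]:
    """Yield evaluation windows that never cross a document boundary.

    Assumption: a BOS token marks the start of a new document.
    """
    # Split the flat token stream into documents at each BOS token.
    docs: List[List[int]] = []
    doc: List[int] = []
    for tok in tokens:
        if tok == bos_id and doc:
            docs.append(doc)
            doc = []
        doc.append(tok)
    if doc:
        docs.append(doc)

    # With stride == window the windows tile each document without overlap.
    for d in docs:
        for start in range(0, len(d), stride):
            yield d[start:start + window]
```

With stride equal to window, the window lengths sum to the stream length, so no token is scored twice or skipped.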
Other
other
Document-local exact-memory scorer that builds causal local memory from already-scored tokens and routes scoring through a compact probe.
parameters: {"exact_causal_3gram":true,"bounded_exact_local_repeat":true,"repeat_match_length":"4-8","top_k":3,"min_support":2,"alpha":0.3}
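
The following is a hedged sketch of how a causal, document-local 3-gram memory could be mixed with the neural model's probability, using the listed top_k=3, min_support=2 and alpha=0.3. It is an illustration under assumptions, not the PR's scorer: the class and function names are hypothetical, min_support is assumed to be the total count observed for the 2-token context, the mix is assumed to be a linear interpolation, and the bounded exact-repeat memory (match length 4-8) and the compact probe are not reproduced.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Sequence, Tuple


class Causal3GramMemory:
    """Document-local 3-gram counts built only from already-scored tokens."""

    def __init__(self, top_k: int = 3, min_support: int = 2, alpha: float = 0.3):
        self.top_k = top_k
        self.min_support = min_support
        self.alpha = alpha
        self.table: Dict[Tuple[int, int], Counter] = defaultdict(Counter)

    def reset(self) -> None:
        # Called at each document boundary so memory stays document-local.
        self.table.clear()

    def memory_probs(self, context: Sequence[int]) -> Dict[int, float]:
        if len(context) < 2:
            return {}
        counts = self.table.get((context[-2], context[-1]))
        # Assumption: support is the total count seen for this context.
        if not counts or sum(counts.values()) < self.min_support:
            return {}
        top = counts.most_common(self.top_k)
        total = sum(c for _, c in top)
        return {tok: c / total for tok, c in top}

    def mixed_prob(self, context: Sequence[int], target: int, p_model: float) -> float:
        mem = self.memory_probs(context)
        if not mem:
            return p_model  # fall back to the neural model alone
        return (1.0 - self.alpha) * p_model + self.alpha * mem.get(target, 0.0)

    def update(self, context: Sequence[int], target: int) -> None:
        if len(context) >= 2:
            self.table[(context[-2], context[-1])][target] += 1


def score_document(tokens: Sequence[int], model_probs: Sequence[float],
                   memory: Causal3GramMemory) -> float:
    """Return the document's negative log-likelihood in nats."""
    memory.reset()
    nll = 0.0
    for i, tok in enumerate(tokens):
        ctx = tokens[max(0, i - 2):i]
        nll -= math.log(memory.mixed_prob(ctx, tok, model_probs[i]))
        memory.update(ctx, tok)  # update only after scoring, to stay causal
    return nll
```

Because the memory distribution sums to one over its top-k tokens, the interpolated probabilities still sum to one over the vocabulary. Updating the table only after a token is scored is what keeps the scorer causal.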
Sequence Length
sequence_length
train_length: null
eval_length: 1024
Compression
zlib
level: null

Novel Contributions

  • Exact-memory probe layered on top of a standard 9L/512d SP-1024 backbone
  • Causal document-local 3-gram memory with bounded exact repeat memory
  • Compact two-level uplift probe that decides when to trust memory over the neural model
  • BOS-reset non-overlap evaluation regime
  • Artifact-backed alpha sweep selecting alpha=0.30
  • Gains over the backbone concentrated on long documents
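
As an illustration of the alpha sweep mentioned above, the sketch below picks the mixing weight that minimises bits per byte over pre-computed per-token records. It assumes each record stores the neural probability, the memory probability and whether the probe chose to trust memory for that token. The record layout, the function names and the grid are hypothetical, and the PR's actual sweep procedure is not shown here.

```python
import math
from typing import Iterable, Sequence, Tuple

# Hypothetical record layout: (p_model, p_memory, use_memory) per scored token.
Record = Tuple[float, float, bool]


def bits_per_byte(probs: Iterable[float], n_bytes: int) -> float:
    """Total code length in bits divided by the number of raw bytes."""
    total_bits = -sum(math.log2(p) for p in probs)
    return total_bits / n_bytes


def sweep_alpha(records: Sequence[Record], n_bytes: int,
                alphas: Sequence[float]) -> Tuple[float, float]:
    """Return (best_alpha, best_bpb) over the candidate grid."""
    if not alphas:
        raise ValueError("alphas must not be empty")
    best_alpha, best_bpb = alphas[0], float("inf")
    for alpha in alphas:
        # Mix only where memory is trusted; otherwise keep the model's value.
        mixed = [
            (1.0 - alpha) * p_model + alpha * p_mem if use else p_model
            for p_model, p_mem, use in records
        ]
        bpb = bits_per_byte(mixed, n_bytes)
        if bpb < best_bpb:
            best_alpha, best_bpb = alpha, bpb
    return best_alpha, best_bpb
```

The grid should stay strictly below 1.0: at alpha=1.0 any trusted token whose memory probability is zero would receive probability zero and an infinite code length.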