val_bpb: 1.1995
Architecture: Transformer
Optimizer: —
Artifact Size: 15,933,037 bytes
Training Techniques

Architecture: weight tying
The standard backbone uses tied input/output embeddings (weight tying) as part of the baseline SP-1024 model.
parameters: null
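Weight tying means a single matrix serves as both the input embedding table and (transposed) the output projection, halving the parameter count of those two layers. A minimal NumPy sketch of the idea, independent of any particular framework:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim = 16, 8

# One shared matrix: the input embedding table and, transposed,
# the output projection ("weight tying").
E = rng.normal(size=(vocab, dim))

def embed(token_ids):
    # Look up input embeddings for a sequence of token ids.
    return E[token_ids]          # shape (T, dim)

def logits(hidden):
    # The output head reuses the same matrix, transposed.
    return hidden @ E.T          # shape (T, vocab)

h = embed(np.array([1, 2, 3]))
out = logits(h)
assert out.shape == (3, vocab)
```

Because `E` is shared, a gradient step on the output head also moves the input embeddings, which is why tying acts as a mild regularizer on small models like this one.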
Evaluation

BOS-reset non-overlap eval
parameters: {"window": 1024, "stride": 1024}

stride-based eval
parameters: {"stride": 1024}
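With window = stride = 1024, evaluation windows never overlap, and resetting to BOS at each window boundary means no window conditions on tokens outside itself. A sketch of that regime, where `score_window` stands in for the real scorer (hypothetical callback, assumed to return total NLL in bits for one window):

```python
BOS = 0  # hypothetical BOS token id

def non_overlap_eval(tokens, score_window, window=1024, stride=1024):
    """BOS-reset non-overlap evaluation: stride == window, so chunks
    are disjoint and each starts from a fresh BOS context."""
    total_bits, total_tokens = 0.0, 0
    for start in range(0, len(tokens), stride):
        chunk = tokens[start:start + window]
        if not chunk:
            break
        # Prepend BOS so the model's context is reset at every boundary;
        # only the chunk's own tokens are scored.
        total_bits += score_window([BOS] + chunk)
        total_tokens += len(chunk)
    return total_bits / max(total_tokens, 1)  # bits per token
```

The last window may be shorter than 1024; it is scored as-is rather than padded.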
Other

other
A document-local exact-memory scorer builds a causal local memory from already-scored tokens and routes scoring decisions through a compact probe.
parameters: {"exact_causal_3gram": true, "bounded_exact_local_repeat": true, "repeat_match_length": "4-8", "top_k": 3, "min_support": 2, "alpha": 0.3}
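A sketch of the exact_causal_3gram part of this scorer, under stated assumptions: `neural_prob` is a hypothetical callback returning the neural model's probability for the token at position `i`, and the linear blend rule is an assumption (the source only gives alpha=0.3, top_k=3, min_support=2). The bounded exact-repeat memory (repeat_match_length 4-8) is not reproduced here.

```python
from collections import defaultdict, Counter

def memory_blend(tokens, neural_prob, top_k=3, min_support=2, alpha=0.3):
    """Blend a causal document-local 3-gram memory with neural probabilities."""
    memory = defaultdict(Counter)   # (t[i-2], t[i-1]) -> next-token counts
    probs = []
    for i, tok in enumerate(tokens):
        p = neural_prob(i)
        if i >= 2:
            ctx = (tokens[i - 2], tokens[i - 1])
            counts = memory[ctx]
            support = sum(counts.values())
            if support >= min_support:
                # Memory distribution over the top_k continuations only.
                top = counts.most_common(top_k)
                mass = sum(c for _, c in top)
                p_mem = dict(top).get(tok, 0) / mass
                p = (1 - alpha) * p + alpha * p_mem  # assumed blend rule
            # Update the memory only AFTER scoring position i, so the
            # memory is built strictly from already-scored tokens (causal).
            memory[ctx][tok] += 1
        probs.append(p)
    return probs
```

Because updates happen after scoring, the third occurrence of a repeated trigram is the first one the memory can boost, matching the min_support=2 threshold.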
Sequence Length

train_length: null
eval_length: 1024
Compression: zlib (level: null)
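A `level` of null presumably means zlib's library default is used. A minimal sketch of measuring a compressed artifact size that way; the payload below is a stand-in, not the actual 15,933,037-byte artifact:

```python
import zlib

# Stand-in bytes; the real artifact would be the model + probe tables.
payload = b"model weights + probe tables would go here" * 1000

# zlib.compress with no level argument uses Z_DEFAULT_COMPRESSION.
compressed = zlib.compress(payload)
print(len(payload), "->", len(compressed), "bytes")
```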
Novel Contributions
- Exact-memory probe layered on top of a standard 9L/512d SP-1024 backbone
- Causal document-local 3-gram memory with bounded exact repeat memory
- Compact two-level uplift probe that decides when to trust memory over the neural model
- BOS-reset non-overlap evaluation regime
- Artifact-backed alpha sweep selecting alpha=0.30
- Performance gains concentrated on long documents
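The alpha sweep in the list above can be sketched as a simple grid search that keeps the blend weight with the lowest validation bpb. `eval_bpb` is a stand-in for the real evaluation harness, and the grid below is an assumption (the source only reports that alpha=0.30 was selected):

```python
def sweep_alpha(eval_bpb, grid=(0.0, 0.1, 0.2, 0.3, 0.4, 0.5)):
    """Evaluate each candidate blend weight and return the best one
    along with the full sweep results (for artifact logging)."""
    results = {a: eval_bpb(a) for a in grid}
    best = min(results, key=results.get)
    return best, results

# Toy proxy objective whose minimum sits at 0.3, for illustration only:
best, results = sweep_alpha(lambda a: (a - 0.3) ** 2 + 1.19)
assert best == 0.3
```

Logging `results` alongside the artifact is what makes the sweep "artifact-backed": the selection is reproducible from the stored numbers.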