PR #944
Open
Record: Compliance-First Packed Causal Memory + Dirichlet Mixing — val_bpb 0.01654407 (3-seed mean)
by aamodbhatt
val_bpb
0.0165
Architecture
Transformer
Optimizer
—
Artifact Size
13,810,840 bytes
Training Techniques
Architecture
BigramHash
Packed causal n-gram memory path built from training shards and loaded at eval start; multi-order hashed n-gram tables used for causal scoring.
parameters: null
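To illustrate the hashed multi-order tables described above, here is a minimal Python sketch. It is not the PR's implementation: the class name `HashedNgramTable`, the bucket count, and the toy polynomial hash are all assumptions, and the packed on-disk format is not shown in this summary. The tables are filled from training tokens only, and a lookup reads only the tokens that precede the scored position, which is what keeps the scoring causal.

```python
class HashedNgramTable:
    """Toy hashed count table for one context length (hypothetical names)."""

    def __init__(self, context_len, num_buckets=1 << 16):
        self.context_len = context_len
        self.num_buckets = num_buckets
        self.ctx_counts = [0] * num_buckets   # how often a context was seen
        self.pair_counts = [0] * num_buckets  # how often context + next token was seen

    def _bucket(self, tokens):
        # Toy polynomial hash (assumption); a real table would use a stronger mix.
        h = 0
        for t in tokens:
            h = (h * 31 + t + 1) % self.num_buckets
        return h

    def fit(self, tokens):
        """Accumulate counts from training tokens only."""
        n = self.context_len
        for i in range(n, len(tokens)):
            ctx = tuple(tokens[i - n:i])
            self.ctx_counts[self._bucket(ctx)] += 1
            self.pair_counts[self._bucket(ctx + (tokens[i],))] += 1

    def lookup(self, history, next_token):
        """Return (pair_count, ctx_count) using only tokens before the scored one."""
        n = self.context_len
        if len(history) < n:
            return 0, 0  # not enough history for this order
        ctx = tuple(history[len(history) - n:])
        return (self.pair_counts[self._bucket(ctx + (next_token,))],
                self.ctx_counts[self._bucket(ctx)])


# One table per order, built once before evaluation starts.
train_tokens = [1, 2, 3, 1, 2, 3, 1, 2, 4]
tables = [HashedNgramTable(n) for n in (1, 2)]
for table in tables:
    table.fit(train_tokens)

# Counts for "3 follows the recent history" at each order, lowest order first.
order_counts = [t.lookup([3, 1, 2], 3) for t in tables]
```

Because distinct n-grams can share a bucket, hashed counts can only overestimate the true counts. That is one plausible reason to pair such tables with a confidence gate.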
Other
other
Dirichlet-normalized multi-order mixing over n-gram orders with count-confidence gating.
parameters: null
other
Optional packed phrase-suffix expert blended after the n-gram posterior with confidence throttling.
parameters: null
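The two entries above do not spell out their formulas, so the following Python sketch shows only one plausible reading. It assumes that each order applies a Dirichlet-prior update whose prior mean is the estimate from the previous, lower order. It also assumes that an order is skipped when its context count is below a threshold, and that the phrase-suffix expert is mixed in linearly with a capped weight. The function names `dirichlet_mix` and `blend_expert` and the parameters `alpha`, `min_count`, and `max_weight` are hypothetical.

```python
def dirichlet_mix(base_prob, order_counts, alpha=1.0, min_count=1):
    """Fold n-gram counts into a base probability, lowest order first.

    order_counts holds (pair_count, ctx_count) per order for the scored token.
    Each step is a Dirichlet-prior posterior mean with concentration alpha.
    """
    p = base_prob
    for pair_count, ctx_count in order_counts:
        if ctx_count < min_count:
            continue  # gate: too little evidence at this order, keep the prior
        # Weight on the counts is ctx_count / (ctx_count + alpha),
        # so better-attested contexts pull the estimate harder.
        p = (pair_count + alpha * p) / (ctx_count + alpha)
    return p


def blend_expert(p_ngram, p_expert, confidence, max_weight=0.5):
    """Linearly blend an expert probability, capping its weight (assumed form)."""
    w = min(max(confidence, 0.0), max_weight)
    return (1.0 - w) * p_ngram + w * p_expert


# Base probability 0.25, then two orders that each saw the token 2 times out of 3.
p = dirichlet_mix(0.25, [(2, 3), (2, 3)])
# An expert that is 0.8 confident still gets at most half of the weight.
p_final = blend_expert(p, 0.9, confidence=0.8)
```

When the counts are exact, each update in this form stays normalized: summing the numerator over the vocabulary gives `ctx_count + alpha`, which matches the denominator.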
Novel Contributions
- Packed causal n-gram memory path built from training shards and loaded at eval start
- Dirichlet-normalized multi-order mixing with count-confidence gating
- Optional packed phrase-suffix expert with confidence throttling
- Compliance-first score-first causal evaluation stack
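The last bullet can be read as follows: a token's probability is computed before that token is allowed to influence any state. The Python sketch below illustrates that ordering and how a bits-per-byte figure such as val_bpb is accumulated. It is an assumption-laden illustration, not the PR's evaluation code: `score_first_bpb`, `predict`, `update`, and the adaptive unigram predictor are hypothetical stand-ins.

```python
import math


def score_first_bpb(tokens, token_bytes, predict, update):
    """Score every token before it is revealed, then return bits per byte."""
    total_bits = 0.0
    total_bytes = 0
    for tok in tokens:
        p = predict(tok)               # state has seen only earlier tokens
        total_bits += -math.log2(p)
        total_bytes += token_bytes[tok]
        update(tok)                    # token becomes visible only after scoring
    return total_bits / total_bytes


# Toy predictor: add-one smoothed unigram counts over a 2-token vocabulary.
vocab_size = 2
counts = [0] * vocab_size


def predict(tok):
    return (counts[tok] + 1) / (sum(counts) + vocab_size)


def update(tok):
    counts[tok] += 1


# Each token is assumed to decode to one byte in this toy example.
bpb = score_first_bpb([0, 0], {0: 1, 1: 1}, predict, update)
```

Swapping the `predict` and `update` calls would let each token raise its own probability before being scored, which is the kind of leak a score-first ordering rules out.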