PR #881
Status: open
Record: WaterLOO — Full-Rescore N-gram Cache with Self-Exclusion (val_bpb 0.0990)
by simon-marcus
val_bpb
0.0990
Architecture
Transformer
Optimizer
—
Artifact Size
~15.87 MB
Training Techniques
Evaluation
sliding window eval
parameters: null
Other
other
Full-rescore two-pass n-gram cache evaluation over the entire validation stream using a prebuilt global cache
parameters: {"ngram_orders":"2-12","full_stream_rescore":true}
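To make the second pass concrete, here is a minimal sketch of how per-token model probabilities from pass 1 might be combined with cache probabilities and turned into bits per byte. The function name `rescore_bpb`, the fixed interpolation weight `lam`, and the use of NaN to mark "no cache evidence" are all assumptions for illustration; the PR summary does not state the actual mixing rule.

```python
import numpy as np

def rescore_bpb(model_probs, cache_probs, total_bytes, lam=0.5):
    """Hypothetical pass-2 rescoring: mix model and cache probabilities.

    model_probs : probability the model gave each true token (pass 1)
    cache_probs : probability the n-gram cache gives each true token,
                  NaN where the cache has no usable evidence
    total_bytes : number of bytes covered by the scored tokens
    lam         : assumed linear interpolation weight (not from the PR)
    """
    model_probs = np.asarray(model_probs, dtype=np.float64)
    cache_probs = np.asarray(cache_probs, dtype=np.float64)
    has_cache = ~np.isnan(cache_probs)
    # Where the cache is silent, fall back to the model alone.
    mixed = np.where(
        has_cache,
        (1.0 - lam) * model_probs + lam * np.nan_to_num(cache_probs),
        model_probs,
    )
    total_bits = -np.log2(mixed).sum()
    return total_bits / total_bytes

bpb = rescore_bpb([0.5, 0.25], [np.nan, 0.75], total_bytes=2)
```

In this toy call the first token keeps its model probability of 0.5 and the second is lifted from 0.25 to 0.5 by the cache, so both tokens cost one bit each.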
other
Leave-one-out self-exclusion during pass 2 by subtracting each token's own context and context-target counts before scoring
parameters: null
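The self-exclusion step can be illustrated with a small bigram-only sketch. Because the cache is built over the whole stream, every token has already contributed one count to its own context and one to its own context-target pair; subtracting both before scoring leaves only evidence from other positions. The function name `loo_bigram_probs`, the exact (unhashed) bigram indexing, and the NaN fallback are assumptions for illustration, not the PR's implementation.

```python
import numpy as np

def loo_bigram_probs(tokens, vocab_size):
    """Leave-one-out bigram probability of each token given its predecessor."""
    tokens = np.asarray(tokens, dtype=np.int64)
    ctx = tokens[:-1]                 # context: previous token
    tgt = tokens[1:]                  # target: current token
    pair = ctx * vocab_size + tgt     # unique id per (context, target)

    ctx_counts = np.bincount(ctx, minlength=vocab_size)
    pair_counts = np.bincount(pair, minlength=vocab_size * vocab_size)

    # Remove each position's own contribution from both counts.
    c = ctx_counts[ctx] - 1
    p = pair_counts[pair] - 1

    # NaN marks positions with no evidence besides themselves.
    return np.where(c > 0, p / np.maximum(c, 1), np.nan)

probs = loo_bigram_probs([0, 1, 0, 1, 0, 2], vocab_size=3)
```

The final token (2 after 0) appears only once, so after self-exclusion its pair count is zero and the cache gives it no credit, whereas a cache that included the token itself would have assigned it a nonzero probability.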
other
Vectorized cache construction using np.bincount
parameters: null
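A rough sketch of how `np.bincount` can replace a Python-level counting loop when building a cache for one n-gram order. The rolling hash, its multiplier, the bucket count, and the function name `build_counts` are hypothetical choices made for this example; the PR summary only states that `np.bincount` is used.

```python
import numpy as np

def build_counts(tokens, order, num_buckets):
    """Count hashed contexts and (context, target) pairs for one order."""
    tokens = np.asarray(tokens, dtype=np.int64)
    ctx_len = order - 1
    n = len(tokens) - ctx_len          # positions that have a full context

    # Polynomial rolling hash over the ctx_len preceding tokens,
    # computed for all positions at once.
    ctx_hash = np.zeros(n, dtype=np.int64)
    for k in range(ctx_len):
        ctx_hash = (ctx_hash * 1000003 + tokens[k:k + n]) % num_buckets

    targets = tokens[ctx_len:]
    pair_hash = (ctx_hash * 1000003 + targets) % num_buckets

    # One bincount call per table tallies every position in the stream.
    ctx_counts = np.bincount(ctx_hash, minlength=num_buckets)
    pair_counts = np.bincount(pair_hash, minlength=num_buckets)
    return ctx_hash, pair_hash, ctx_counts, pair_counts

ctx_hash, pair_hash, ctx_counts, pair_counts = build_counts(
    [1, 2, 1, 2, 1, 2], order=2, num_buckets=1 << 16
)
```

The only Python loop runs over the context length (at most 11 iterations for order 12), not over the stream, so the cost per order is dominated by a handful of array operations. Hash collisions are ignored here; a real cache would need to decide how to handle them.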
other
Complementary training enabled
parameters: null
Sequence Length
sequence_length
train_length: null
eval_length: null
Novel Contributions
- Full-rescore n-gram cache evaluated over the entire validation stream
- Leave-one-out self-exclusion that removes each token's own cache contribution during rescoring
- Fast vectorized cache construction with np.bincount
- Demonstration that the full-rescore architecture remains strong even without self-inclusion