val_bpb: 1.1606
Architecture: Transformer
Optimizer: —
Artifact Size: 13,446,760 bytes
Training Techniques
- Test-Time Training: none
  - parameters: `{"TTT_ENABLED":0}`
Evaluation
- Score-first causal evaluation
  - parameters: `{"NGRAM_EVAL_ENABLED":0,"NGRAM_TWO_PASS_ENABLED":0,"NGRAM_FULL_RESCORE":0,"SKIP_SLIDING_EVAL":1}`
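The val_bpb figure above is bits-per-byte: under a score-first causal evaluation, each token is scored against the model's next-token distribution before the context advances, and the summed negative log-likelihood is normalized by the byte length of the data. A minimal sketch of that arithmetic follows; the function name, the toy uniform byte-level model, and the sample data are illustrative assumptions, not the submission's actual evaluation code:

```python
import math

def bits_per_byte(token_nlls_nats, n_bytes):
    """Convert summed causal NLLs (in nats) into bits-per-byte.

    token_nlls_nats: per-token negative log-likelihoods, each computed
    from the model's next-token distribution given only prior context.
    """
    total_bits = sum(token_nlls_nats) / math.log(2)  # nats -> bits
    return total_bits / n_bytes

# Hypothetical uniform byte-level model: every next byte has probability
# 1/256, so each token contributes -ln(1/256) nats of surprisal.
data = b"hello world"
nlls = [-math.log(1.0 / 256.0) for _ in data]
print(bits_per_byte(nlls, len(data)))  # ~8.0 for a uniform byte model
```

A lower score means better compression of the validation data; 1.1606 bpb corresponds to roughly a 6.9x reduction versus the 8.0 bpb of the uniform baseline sketched here.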
Novel Contributions
- Neural-only submission: no n-gram or two-pass cache blending
- No test-time training
- Tokenizer and dataset left unchanged
- Score-first causal evaluation path preserved
- Compliance-focused submission, legal under the contest rules