PR #946

open

Non-record: Legal Neural-Only No-TTT (8xH100) val_bpb=1.1606

by aamodbhatt
val_bpb
1.1606
Architecture
Transformer
Optimizer
Artifact Size
13,446,760 bytes
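
For readers unfamiliar with the val_bpb figure above, the following is a minimal sketch of how a bits-per-byte score is commonly computed from per-token losses. It is not taken from this PR: the function name `bits_per_byte` and its arguments are hypothetical, and it assumes the losses are natural-log negative log-likelihoods summed over the validation set and divided by the raw byte count of the same text.

```python
import math


def bits_per_byte(token_nll_nats, total_bytes):
    """Convert per-token negative log-likelihoods (in nats) to bits per byte.

    Hypothetical helper for illustration; the PR's own evaluation code
    may differ in how it aggregates losses and counts bytes.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Nats to bits: divide by ln(2).
    total_bits = sum(token_nll_nats) / math.log(2)
    return total_bits / total_bytes


# Toy usage: 8 tokens, each costing exactly 1 bit, covering 4 bytes of text.
toy_losses = [math.log(2)] * 8
toy_score = bits_per_byte(toy_losses, total_bytes=4)
```

Normalizing by bytes rather than tokens keeps the score comparable across tokenizers, which matters less here since the tokenizer is left unchanged.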

Training Techniques

Test-Time Training
none
parameters: {"TTT_ENABLED":0}
Evaluation
score-first causal evaluation
parameters: {"NGRAM_EVAL_ENABLED":0,"NGRAM_TWO_PASS_ENABLED":0,"NGRAM_FULL_RESCORE":0,"SKIP_SLIDING_EVAL":1}
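
The two parameter sets above can be checked mechanically. The sketch below is illustrative only: the flag names and values are copied from this summary, but the `MUST_BE_OFF` rule and the `enabled_flags` helper are assumptions about what "neural-only, no TTT" implies, not code from the PR.

```python
import json

# Parameter strings copied verbatim from the summary above.
TTT_PARAMS = '{"TTT_ENABLED":0}'
EVAL_PARAMS = (
    '{"NGRAM_EVAL_ENABLED":0,"NGRAM_TWO_PASS_ENABLED":0,'
    '"NGRAM_FULL_RESCORE":0,"SKIP_SLIDING_EVAL":1}'
)

# Assumption: these flags must all be 0 for a neural-only, no-TTT run.
MUST_BE_OFF = (
    "TTT_ENABLED",
    "NGRAM_EVAL_ENABLED",
    "NGRAM_TWO_PASS_ENABLED",
    "NGRAM_FULL_RESCORE",
)


def enabled_flags(*param_blobs):
    """Return the names in MUST_BE_OFF that are set to a non-zero value."""
    merged = {}
    for blob in param_blobs:
        merged.update(json.loads(blob))
    # A flag that is absent is treated as off in this sketch.
    return [name for name in MUST_BE_OFF if merged.get(name, 0) != 0]


violations = enabled_flags(TTT_PARAMS, EVAL_PARAMS)
```

With the values listed in this summary, `violations` is empty; `SKIP_SLIDING_EVAL` is deliberately left out of the check because its value of 1 skips an evaluation pass rather than enabling a cache or adaptation step.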

Novel Contributions

  • Neural-only submission without n-gram/two-pass cache blending
  • No test-time training
  • Tokenizer and dataset left unchanged
  • Score-first causal evaluation path preserved
  • Compliance-focused legal submission