PR #2122

open

Non-Record: Add Novel SemanticEngine SSM submission

by KenMalloy
val_bpb
1.6429
Architecture
Hybrid
Optimizer
Muon
Artifact Size

Training Techniques

Architecture
depth recurrence
Pure SSM trunk with CareSSM diagonal recurrent blocks and live episodic memory during training and eval.
parameters: null
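
For readers unfamiliar with diagonal recurrent blocks, the sketch below shows a generic diagonal linear recurrence, h_t = a * h_{t-1} + b * x_t, applied independently per channel. It is not the CareSSM implementation, which is not shown in this summary; the function name `diagonal_ssm_scan` and the coefficient vectors `a` and `b` are hypothetical, chosen only to illustrate why a diagonal state update needs no matrix multiply.

```python
from typing import List


def diagonal_ssm_scan(
    xs: List[List[float]], a: List[float], b: List[float]
) -> List[List[float]]:
    """Run h_t = a * h_{t-1} + b * x_t element-wise over a sequence.

    xs is a list of time steps, each a list with one value per channel.
    a and b are per-channel coefficients (hypothetical names, not from the PR).
    """
    state = [0.0] * len(a)  # h_0 = 0 for every channel
    outputs = []
    for x in xs:
        # Each channel only sees its own previous state: this is the
        # "diagonal" part, so one step costs O(channels).
        state = [a_i * h_i + b_i * x_i for a_i, h_i, b_i, x_i in zip(a, state, b, x)]
        outputs.append(list(state))
    return outputs


if __name__ == "__main__":
    print(diagonal_ssm_scan([[1.0, 1.0], [0.0, 1.0]], a=[0.5, 0.0], b=[1.0, 2.0]))
```

Because the channels do not interact inside the recurrence, any mixing across channels would have to happen in the layers around it.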
other
Dedicated memory GPUs for packet-serving and memory maintenance ranks that operate separately from trunk training.
parameters: {"gpu_6_packet_serving":true,"gpu_7_maintenance":true}
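
One way to read the parameters above is as a rank-to-role map. The sketch below assumes that GPU indices match process ranks and that every rank other than 6 and 7 trains the trunk; neither assumption is stated in the PR, and `rank_role` and the role strings are hypothetical names.

```python
def rank_role(rank: int) -> str:
    """Return the role for a process rank (illustrative mapping only)."""
    # Ranks 6 and 7 follow the gpu_6_packet_serving and gpu_7_maintenance
    # flags; treating all other ranks as trunk training is an assumption.
    roles = {6: "packet_serving", 7: "memory_maintenance"}
    return roles.get(rank, "trunk_training")


if __name__ == "__main__":
    for rank in range(8):
        print(rank, rank_role(rank))
```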
Optimizer
Muon
weight_decay: null
momentum: null
other_params: {"semantic_optimizer":"SemanticOptimizer","ssm_channel_coupled_momentum_beta":true}
Evaluation
prequential eval
parameters: {"score_before_write":true,"packet_online_cache":true}
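
The score-before-write rule can be sketched with a toy online cache: each byte is scored using only what the cache held before that byte was seen, and the cache is updated afterwards. This is a minimal illustration, assuming val_bpb means bits per byte; the bigram count cache, the smoothing constant `alpha`, and the name `prequential_bpb` are hypothetical stand-ins for the submission's packet cache and episodic memory.

```python
import math
from collections import defaultdict


def prequential_bpb(data: bytes, alpha: float = 1.0) -> float:
    """Average bits per byte under score-first, write-after evaluation."""
    counts = defaultdict(lambda: defaultdict(int))  # context -> next byte -> count
    totals = defaultdict(int)                       # context -> total count
    total_bits = 0.0
    prev = -1  # sentinel context before the first byte
    for byte in data:
        # 1. Score: the probability uses only counts written by earlier bytes.
        p = (counts[prev][byte] + alpha) / (totals[prev] + alpha * 256)
        total_bits += -math.log2(p)
        # 2. Write: only now does the current byte enter the cache.
        counts[prev][byte] += 1
        totals[prev] += 1
        prev = byte
    return total_bits / max(len(data), 1)


if __name__ == "__main__":
    print(prequential_bpb(b"abababababababab"))
```

Swapping steps 1 and 2 would let each byte contribute to its own prediction, which is the leak that the score-first ordering rules out.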
Other
other
Live episodic memory active during both training and prequential evaluation, with score-first episodic writes and cache updates after scoring.
parameters: {"episodic_reads_per_eval":3348,"episodic_writes_per_eval":3348}

Novel Contributions

  • Pure SSM trunk submission for track_10min_16mb
  • Live episodic memory used during both training and legal prequential evaluation
  • Dedicated GPU packet-serving and memory-maintenance ranks
  • Score-before-write packet-online evaluation cache
  • CareSSM trunk with CRCT evidence substrate and MultiSlotOuterModel replay/eviction pipeline