PR #1491 (open)

Non-record: Faithful mHC-lite

by wisebreadloaf
val_bpb: 1.6924
Architecture: Transformer
Optimizer:
Artifact Size: 2,290,235 bytes

Training Techniques

Architecture: Hybrid
A faithful standalone mHC-lite branch with explicit residual-stream expansion and reduction, plus hyper-connected attention and MLP branches, integrated into a compact GPT-style model.
parameters: {"layers":4,"model_dim":256,"num_heads":4,"num_kv_heads":4,"num_streams":4,"num_fracs":1}
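The expansion/reduction pattern described above can be illustrated with a minimal NumPy sketch. This is a hypothetical illustration of hyper-connections-style residual streams (here `num_streams = 4`, matching the parameters), not the PR's actual `mhc_lite.py` implementation; the function names and the static read/write weights are assumptions.

```python
import numpy as np

def expand(h, num_streams):
    # Replicate the single residual stream into n parallel streams: (n, d)
    return np.stack([h] * num_streams, axis=0)

def hyper_connected_layer(streams, layer_fn, alpha, beta):
    # alpha (n,): read weights mixing the streams into one layer input
    # beta  (n,): write weights distributing the layer output back
    x = np.tensordot(alpha, streams, axes=1)   # (d,) layer input
    y = layer_fn(x)                            # attention or MLP branch
    return streams + np.outer(beta, y)         # residual write to all streams

def reduce(streams):
    # Collapse the expanded streams back to a single residual stream.
    return streams.mean(axis=0)

d, n = 8, 4
h = np.zeros(d)
streams = expand(h, n)                         # (4, 8)
mlp = np.tanh                                  # stand-in for a real branch
streams = hyper_connected_layer(streams, mlp, np.ones(n) / n, np.ones(n))
out = reduce(streams)                          # (8,)
```

In the full architecture this expand/layer/reduce cycle would wrap every transformer block, with learned (possibly dynamic) `alpha` and `beta` rather than the fixed weights used here.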
Sequence Length
train_length: 256
eval_length: null
Regularization: dropout
parameters: {"rate":0}

Novel Contributions

  • Faithful standalone mHC-lite architecture
  • Residual-stream expansion and reduction
  • Hyper-connected attention and MLP branches
  • Compact GPT-style integration of MHCLite
  • Bundled local copy of hyper_conn/mhc_lite.py for standalone execution
  • Non-record methodological / negative-result submission