val_bpb: 1.6924
Architecture: Transformer
Optimizer: —
Artifact Size: 2,290,235 bytes
Training Techniques
Architecture: Hybrid
A faithful standalone mHC-lite branch with explicit residual-stream expansion and reduction, plus hyper-connected attention and MLP branches, inside a compact GPT-style model (a minimal sketch follows the parameters below).
parameters: {"layers":4,"model_dim":256,"num_heads":4,"num_kv_heads":4,"num_streams":4,"num_fracs":1}
Sequence Length
train_length: 256
eval_length: null
Regularization
dropout
parameters: {"rate": 0} (a rate of 0 means dropout is effectively disabled)
Novel Contributions
- Faithful standalone mHC-lite architecture
- Residual-stream expansion and reduction
- Hyper-connected attention and MLP branches
- Compact GPT-style integration of MHCLite
- Bundled local copy of hyper_conn/mhc_lite.py for standalone execution (a hypothetical usage sketch follows this list)
- Non-record methodological / negative-result submission
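Because the module is bundled for standalone execution, a usage sketch may help. It is purely hypothetical: the card does not show the module's constructor, so the import path, the MHCLite class name (taken from the contribution list above), and every keyword argument below (which simply mirror the reported parameters) are assumptions.

```python
# Hypothetical usage sketch. The bundled path hyper_conn/mhc_lite.py and the
# MHCLite name come from this card; the constructor and forward signatures
# are assumptions mirroring the reported hyperparameters.
import torch
from hyper_conn.mhc_lite import MHCLite  # bundled local copy (assumed import path)

model = MHCLite(layers=4, model_dim=256, num_heads=4,
                num_kv_heads=4, num_streams=4, num_fracs=1)  # assumed signature
tokens = torch.randint(0, 256, (1, 256))  # byte-level ids, train_length = 256
logits = model(tokens)                    # assumed output: (batch, seq, vocab)
```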