PR #984
open submission 2026-03-27_PhaseCoherenceGatedGradients PIC-GID + ParallelMuon
by jzgdev
val_bpb
1.3178
Architecture
Transformer
Optimizer
Muon
Artifact Size
—
Training Techniques
Optimizer
Muon
weight_decay: null
momentum: null
other_params: {"Adam split":true,"parallel":true}
Quantization
int8
bits: 8
scope: model weights
Compression
zlib
level: null
Other
other
Phase-induced coherence-gated gradient descent (PIC-GD) using paired real/imag latent channels and target embeddings to compute a detached coherence-based gradient gate.
parameters: {"beta":2,"min_gate":0.05,"eps":0.000001,"token_stride":32,"enabled":true}
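A minimal sketch of how such a gate could be computed, using the parameter names listed above (beta, min_gate, eps, token_stride). The card does not give the formula, so everything else here is an assumption for illustration: the normalized complex inner product as the coherence measure, averaging over strided tokens, and the helper names pair_channels, coherence, coherence_gate and apply_gate. "Detached" is modeled by treating the gate as a plain constant that scales the gradient; in an autograd framework the gate would be computed without gradient tracking.

```python
import math

def pair_channels(vec):
    """Treat adjacent channels (2k, 2k+1) as real and imaginary parts."""
    if len(vec) % 2:
        raise ValueError("hidden size must be even to pair channels")
    return [complex(vec[i], vec[i + 1]) for i in range(0, len(vec), 2)]

def coherence(hidden, target, eps=1e-6):
    """Assumed measure: |<z_h, z_t>| / (|z_h| * |z_t| + eps), in [0, 1]."""
    zh, zt = pair_channels(hidden), pair_channels(target)
    inner = sum(a * b.conjugate() for a, b in zip(zh, zt))
    norm_h = math.sqrt(sum(abs(a) ** 2 for a in zh))
    norm_t = math.sqrt(sum(abs(b) ** 2 for b in zt))
    return abs(inner) / (norm_h * norm_t + eps)

def coherence_gate(hiddens, targets, beta=2.0, min_gate=0.05,
                   eps=1e-6, token_stride=32):
    """Scalar gate from every token_stride-th position (assumed aggregation)."""
    positions = range(0, len(hiddens), token_stride)
    values = [coherence(hiddens[t], targets[t], eps) for t in positions]
    if not values:
        raise ValueError("need at least one token")
    mean_c = sum(values) / len(values)
    # Sharpen with beta, then clamp so the gradient is never fully suppressed.
    return max(min_gate, min(1.0, mean_c ** beta))

def apply_gate(grad, gate):
    """The gate is a constant here, so no gradient flows through it."""
    return [gate * g for g in grad]
```

Under these assumptions, a hidden state aligned with its target embedding passes gradients almost unchanged, while an orthogonal one is scaled down to min_gate.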
other
Tokenizer-agnostic val_bpb evaluation and int8 + zlib roundtrip export path.
parameters: null
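The int8 + zlib roundtrip mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the submission's export code: symmetric per-tensor scaling, the header layout (float64 scale plus uint32 count), and zlib level 9 are all choices made here, since the card lists the compression level as null. The function names are hypothetical.

```python
import struct
import zlib

def quantize_int8(weights):
    """Symmetric per-tensor quantization to the range [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # all-zero tensor -> 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def export_blob(weights, level=9):
    """Pack scale + count + int8 values, then compress with zlib."""
    q, scale = quantize_int8(weights)
    header = struct.pack("<dI", scale, len(q))   # 8-byte scale, 4-byte count
    body = struct.pack(f"<{len(q)}b", *q)        # signed 8-bit values
    return zlib.compress(header + body, level)

def import_blob(blob):
    """Invert export_blob and dequantize back to floats."""
    payload = zlib.decompress(blob)
    scale, n = struct.unpack_from("<dI", payload, 0)
    q = struct.unpack_from(f"<{n}b", payload, 12)
    return [v * scale for v in q]
```

A roundtrip check compares import_blob(export_blob(w)) against w; with this scheme each restored weight differs from the original by at most about half a quantization step.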
Novel Contributions
- Phase-induced coherence-gated gradient descent (PIC-GD)
- Pseudo-complex latent pairing of adjacent hidden channels as real and imaginary parts
- Coherence-based detached gradient gating
- Muon + Adam optimizer split, with parallel Muon mentioned
- Tokenizer-agnostic val_bpb evaluation
- Int8 plus zlib roundtrip export path
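For the tokenizer-agnostic val_bpb item in the list above, a minimal sketch of the standard bits-per-byte conversion: the summed negative log-likelihood (in nats) is divided by ln 2 times the number of UTF-8 bytes in the evaluated text, so the result does not depend on how the tokenizer split that text. The function name and the assumption that losses are reported in nats are illustrative; the submission's evaluation code may differ.

```python
import math

def val_bpb(token_nll_nats, text):
    """Bits per byte: total NLL in nats / (ln 2 * UTF-8 byte count)."""
    n_bytes = len(text.encode("utf-8"))
    if n_bytes == 0:
        raise ValueError("text must contain at least one byte")
    return sum(token_nll_nats) / (math.log(2) * n_bytes)
```

Because the denominator counts bytes rather than tokens, two models with different vocabularies can be compared on the same validation text.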