PR #1417 (open): Mixed INT5/INT6 QAT from step 1 (1.3039 bpb)

by BruhTheMomentumView on GitHub
val_bpb: 1.3039
Architecture: Transformer
Optimizer: Muon
Artifact Size: 15.4 MB

Training Techniques

Quantization
  • mixed int5/int6 (bits: 5, scope: MLP weights)
  • mixed int5/int6 (bits: 6, scope: attention weights)
  • QAT (bits: null, scope: all weights)
  • STE QAT (bits: null, scope: all weights)

Compression
  • zstd (level: 22)

Architecture
  • GQA: grouped query attention with fewer KV heads than attention heads (parameters: {"heads":8,"kv_heads":4})
  • U-Net skip connections: adds U-Net style skip connections to the model (parameters: null)
  • weight tying: ties input and output embeddings (parameters: null)
  • MLP3x: uses a 3x MLP expansion (parameters: {"expansion":3})

Optimizer
  • Muon (weight_decay: null, momentum: null, other_params: {"Adam_for":"embeddings/scalars"})

Regularization
  • weight decay (parameters: {"value":0.04})
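The mixed-precision fake quantization listed above can be sketched as follows. This is a minimal NumPy illustration assuming symmetric per-tensor scales; the `fake_quantize` helper, tensor shapes, and random weights are illustrative, not the PR's actual code:

```python
import numpy as np

def fake_quantize(w, bits):
    """Quantize-dequantize a weight tensor with a symmetric per-tensor scale.

    During QAT the forward pass sees these fake-quantized weights; the
    straight-through estimator (STE) treats round() as identity in the
    backward pass, so gradients still update the full-precision weights.
    """
    qmax = 2 ** (bits - 1) - 1                  # 15 for int5, 31 for int6
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(0)
mlp_w = rng.standard_normal((256, 64)).astype(np.float32)
attn_w = rng.standard_normal((64, 64)).astype(np.float32)

mlp_fq = fake_quantize(mlp_w, bits=5)   # INT5 for MLP weights
attn_fq = fake_quantize(attn_w, bits=6) # INT6 for attention weights
```

After fake quantization each tensor takes at most 2**bits distinct values, so exporting the integer codes plus one scale per tensor reproduces the forward pass exactly.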

Novel Contributions

  • Mixed INT5/INT6 quantization with INT5 for MLP weights and INT6 for attention weights
  • Quantization-aware training from step 1 using fake-quantized forward passes and STE
  • Entropy-aware compression perspective showing QAT reduces weight entropy and improves compressibility
  • Demonstrated that early QAT substantially outperforms late QAT for post-quantization quality
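The entropy-aware compression point can be illustrated with a toy experiment: quantized weight codes take far fewer distinct byte values than raw float32 bytes, so a general-purpose compressor shrinks them much further. This sketch uses Python's built-in zlib as a stand-in for zstd level 22; the Gaussian weights and resulting ratios are illustrative, not the PR's measurements:

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(16384).astype(np.float32)

# Raw float32 weights: mantissa bits are close to random, so the
# byte stream compresses poorly.
raw = w.tobytes()

# INT5 codes (stored one code per byte here for simplicity): the
# peaked, low-entropy code distribution compresses well.
qmax = 2 ** (5 - 1) - 1
scale = np.abs(w).max() / qmax
codes = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)

raw_ratio = len(zlib.compress(raw, 9)) / len(raw)
code_ratio = len(zlib.compress(codes.tobytes(), 9)) / len(codes.tobytes())
print(f"float32 ratio: {raw_ratio:.2f}, int5 code ratio: {code_ratio:.2f}")
```

Training with QAT sharpens the weight distribution onto the quantization grid, which is why the compressed artifact (15.4 MB here) ends up well below the naive bits-per-weight budget.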