PR #1788

open

Non-record: QAT cooldown + INT4 MLP + NuMuon-lite - 1.12 BPB

by marinabar
val_bpb: 1.1200
Architecture: Transformer
Training Techniques

Quantization
  • QAT (bits: 6, scope: all)
  • mixed int4/int6 (bits: null, scope: MLP and attention)
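The mixed-precision quantization listed above can be illustrated with a minimal fake-quantization sketch in the QAT style (quantize to a signed integer grid, then dequantize back to float). The function name and per-tensor symmetric scaling are assumptions for illustration, not taken from the PR; per the listing, MLP weights get INT4 and attention weights get INT6.

```python
import numpy as np

def fake_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric per-tensor fake quantization: round weights to a
    signed integer grid, then dequantize back to float (QAT-style)."""
    qmax = 2 ** (bits - 1) - 1          # 7 for int4, 31 for int6
    scale = np.abs(w).max() / qmax
    if scale == 0:
        return w
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

# Mixed precision per the listing: INT4 for MLP weights, INT6 for attention.
rng = np.random.default_rng(0)
mlp_w  = rng.standard_normal((8, 8)).astype(np.float32)
attn_w = rng.standard_normal((8, 8)).astype(np.float32)
mlp_q  = fake_quantize(mlp_w, bits=4)   # at most 16 distinct levels
attn_q = fake_quantize(attn_w, bits=6)  # at most 64 distinct levels
```

During QAT the forward pass uses the fake-quantized weights while gradients flow to the full-precision copies (straight-through estimator), so the network learns to tolerate the coarser INT4 grid in the MLPs.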
Regularization
  • weight decay (parameters: null)
Other
  • Frobenius-norm penalty applied every 50 steps to encourage low-rank structure for better downstream compression (parameters: {"interval_steps": 50, "type": "Frobenius-norm penalty"})
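The interval-applied Frobenius-norm penalty can be sketched as an extra gradient term added once every 50 steps. The penalty strength, learning rate, and function names below are hypothetical; only the 50-step interval comes from the listing.

```python
import numpy as np

FROB_INTERVAL = 50    # interval_steps from the listing
FROB_LAMBDA = 1e-4    # hypothetical penalty strength, not given in the PR

def frobenius_penalty_grad(w: np.ndarray, lam: float = FROB_LAMBDA) -> np.ndarray:
    # Gradient of lam * ||W||_F^2 with respect to W is 2 * lam * W.
    return 2.0 * lam * w

def training_step(step: int, w: np.ndarray, grad: np.ndarray,
                  lr: float = 0.01) -> np.ndarray:
    """One SGD step; the Frobenius penalty is added only every
    FROB_INTERVAL steps, matching the PR's interval_steps=50."""
    g = grad
    if step % FROB_INTERVAL == 0:
        g = g + frobenius_penalty_grad(w)
    return w - lr * g
```

Applied this way, the penalty acts as an occasional pull of all weights toward zero, which the PR claims nudges the weight matrices toward lower effective rank and better GPTQ+Brotli compressibility.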
Optimizer
  • Muon (weight_decay: null, momentum: null, other_params: {"variant": "NuMuon-lite"})
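Muon-style optimizers orthogonalize the momentum update via a Newton-Schulz iteration before applying it to the weights. The details of the "NuMuon-lite" variant are not given in the PR, so the sketch below shows only the generic cubic Newton-Schulz orthogonalization that the Muon family is built on; the function name and step count are illustrative.

```python
import numpy as np

def newton_schulz_orthogonalize(g: np.ndarray, steps: int = 30) -> np.ndarray:
    """Approximate the orthogonal (polar) factor of g with the cubic
    Newton-Schulz iteration X <- 1.5*X - 0.5*X @ X.T @ X.
    Normalizing by the Frobenius norm first keeps the spectral norm
    <= 1, which is inside the iteration's convergence region."""
    x = g / np.linalg.norm(g)
    for _ in range(steps):
        x = 1.5 * x - 0.5 * x @ x.T @ x
    return x
```

The effect is that every singular value of the update is pushed toward 1, so the optimizer applies a "whitened" update whose direction, not magnitude, carries the gradient information.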

Novel Contributions

  • QAT fused into the cooldown phase instead of applying GPTQ only after training
  • Mixed precision with INT4 MLP weights and INT6 attention weights
  • NuMuon-lite Frobenius-norm regularization to encourage low-rank structure and improve GPTQ+Brotli compression
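The headline idea, fusing QAT into the cooldown phase rather than running GPTQ only after training, amounts to switching fake quantization on exactly when the learning-rate cooldown begins. A minimal schedule sketch, with the cooldown fraction, base learning rate, and function name all hypothetical (the PR does not specify them):

```python
def lr_and_qat(step: int, total_steps: int, base_lr: float = 3e-4,
               cooldown_frac: float = 0.2) -> tuple[float, bool]:
    """Linear LR cooldown over the last `cooldown_frac` of training;
    fake quantization (QAT) turns on at the start of the cooldown."""
    cooldown_start = int(total_steps * (1 - cooldown_frac))
    if step < cooldown_start:
        return base_lr, False                 # full-precision phase
    frac = (step - cooldown_start) / (total_steps - cooldown_start)
    return base_lr * (1 - frac), True         # QAT active during cooldown
```

The training loop would then quantize weights on the forward pass whenever the returned flag is true, so the model adapts to the INT4/INT6 grids while the learning rate decays, leaving nothing for a separate post-training GPTQ pass to undo.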