PR #1531

Status: open

Add V22 Int6 fast-converging 16MB model (~8min on RTX 4090)

by mini-sarami
val_bpb: 1.4537
Architecture: Transformer
Optimizer: Muon
Artifact Size: 11.38 MB

Training Techniques

  • Quantization: int6 (bits: 6, scope: all; see the packing sketch below)
  • Architecture: V22, a custom architecture with efficient parameter usage (parameters: null)
  • Optimizer: Muon (weight_decay: null, momentum: null, other_params: {"tuned": true}; see the Muon sketch below)
  • Compression: zlib (level: 9; applied in the packing sketch below)
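
The PR metadata lists the settings but not the code, so the following is a minimal sketch of an int6 quantize-pack-compress pipeline under the stated settings (6 bits, all tensors, zlib level 9). The quantization scheme (symmetric, per-tensor) and all names here are illustrative assumptions, not taken from the submission.

```python
import zlib
import numpy as np

def quantize_int6(w: np.ndarray):
    """Symmetric per-tensor quantization to 6-bit integers in [-31, 31].

    Assumption: the PR does not state whether scaling is per-tensor or
    per-channel; this shows the simplest per-tensor variant.
    """
    m = np.abs(w).max()
    scale = m / 31.0 if m > 0 else 1.0
    q = np.clip(np.round(w / scale), -31, 31).astype(np.int8)
    return q, scale

def pack_int6(q: np.ndarray) -> bytes:
    """Pack 6-bit values densely: four values occupy three bytes."""
    u = (q.astype(np.int16) + 32).astype(np.uint8).ravel()  # shift to [1, 63]
    pad = (-len(u)) % 4
    u = np.concatenate([u, np.zeros(pad, dtype=np.uint8)]).reshape(-1, 4)
    b0 = (u[:, 0] << 2) | (u[:, 1] >> 4)
    b1 = ((u[:, 1] & 0x0F) << 4) | (u[:, 2] >> 2)
    b2 = ((u[:, 2] & 0x03) << 6) | u[:, 3]
    return np.stack([b0, b1, b2], axis=1).tobytes()

# Example: quantize, pack, and compress a stand-in weight matrix.
w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int6(w)
blob = zlib.compress(pack_int6(q), level=9)  # level 9, per the PR metadata
print(f"{q.size} weights -> {len(blob)} bytes after packing + zlib")
```

Packing before compression matters here: zlib operates on bytes, so storing one int6 per byte would waste two bits per weight that the compressor can only partially recover.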
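
The Muon hyperparameters are reported as null/tuned, so nothing exact can be reproduced from the PR. For context, the published Muon update is heavy-ball momentum whose 2D update matrix is approximately orthogonalized with a few Newton-Schulz iterations; the sketch below shows that core step in PyTorch, with lr and momentum as placeholder values.

```python
import torch

def newton_schulz5(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately orthogonalize G via the quintic Newton-Schulz
    iteration used in Muon (coefficients from the reference code)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G.bfloat16()
    transposed = G.size(0) > G.size(1)
    if transposed:
        X = X.mT
    X = X / (X.norm() + 1e-7)  # ensure spectral norm <= 1
    for _ in range(steps):
        A = X @ X.mT
        X = a * X + (b * A + c * A @ A) @ X
    if transposed:
        X = X.mT
    return X.to(G.dtype)

@torch.no_grad()
def muon_step(param, grad, buf, lr=0.02, momentum=0.95):
    """One Muon update for a 2D weight matrix. lr and momentum are
    placeholders; the PR marks them as tuned without giving values."""
    buf.mul_(momentum).add_(grad)   # heavy-ball momentum accumulator
    param.add_(newton_schulz5(buf), alpha=-lr)
```

Muon is normally applied only to the 2D hidden weight matrices, with embeddings, norms, and the output head handled by a standard optimizer; the PR does not say how this submission splits its parameters.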

Novel Contributions

  • INT6 quantization for aggressive compression
  • V22 architecture with efficient parameter usage
  • Fast convergence under strict compute and size constraints
  • Use of zlib compression to fit the artifact within the size limit
  • Single-GPU training setup (one RTX 4090, ~8 minutes)