PR #502

open

Non-record: 1xH100 auto-research int6 policy sweep

by aamodbhatt
val_bpb: 1.5248
Artifact Size: 13,144,462 bytes

Training Techniques

Quantization: mixed int6/int8 (bits: 6; scope: attn, mlp)
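The mixed int6/int8 policy above (6-bit for attention and MLP tensors, 8-bit elsewhere) could be sketched as follows. This is a minimal illustration, not the PR's actual export code; the tensor-name matching and helper names are assumptions.

```python
import numpy as np

def quantize_symmetric(w: np.ndarray, bits: int):
    """Symmetric per-tensor quantization to a signed `bits`-wide range."""
    qmax = 2 ** (bits - 1) - 1            # 31 for int6, 127 for int8
    scale = float(np.abs(w).max()) / qmax or 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def export_mixed(weights: dict, int6_scope=("attn", "mlp")):
    """Quantize tensors whose name matches the int6 scope to 6 bits,
    everything else to 8 bits (mirrors bits: 6, scope: attn, mlp)."""
    out = {}
    for name, w in weights.items():
        bits = 6 if any(tag in name for tag in int6_scope) else 8
        q, scale = quantize_symmetric(w, bits)
        out[name] = (q, scale, bits)
    return out
```

Int6 values are stored here in int8 containers for simplicity; a real exporter would pack them to realize the size savings.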
Sequence Length: train_length: 1024; eval_length: null
Architecture: MLP width multiplier (model width multiplier for MLP layers; parameters: {"MLP_MULT": 2})
Compression: zlib (level: null)
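With level: null, the zlib step presumably falls back to the library default. A minimal sketch of that compression step, with the null-to-default mapping as an assumption:

```python
import zlib

def compress_artifact(payload: bytes, level=None) -> bytes:
    """Compress a serialized artifact; level=None (the metadata's
    `level: null`) is taken to mean zlib's default compression level."""
    eff = zlib.Z_DEFAULT_COMPRESSION if level is None else level
    return zlib.compress(payload, eff)
```

A round trip (`zlib.decompress(compress_artifact(data)) == data`) recovers the original bytes, so the step is lossless and affects only artifact size.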

Novel Contributions

  • Introduced an INT6_CATS environment variable to control the mixed int6 quantization policy at export time
  • Ran an auto-research policy sweep over mixed-quantization categories under a fixed 10-minute budget on 1xH100 hardware
  • Demonstrated that aggressive int6 export policies preserve artifact size but open a large quantization gap, providing a useful negative result
  • Provided a structured negative-result submission to prune dead-end search directions before committing to expensive multi-GPU runs
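Reading the INT6_CATS policy at export time might look like the following sketch. The PR does not show the variable's format, so the comma-separated convention and the fallback to the metadata's attn/mlp scope are assumptions.

```python
import os

def int6_categories(default=("attn", "mlp")):
    """Read the int6 policy from INT6_CATS, e.g. INT6_CATS="attn,mlp".
    Unset or empty falls back to the default scope from the PR metadata."""
    raw = os.environ.get("INT6_CATS", "").strip()
    cats = tuple(c.strip() for c in raw.split(",") if c.strip())
    return cats or default
```

A sweep driver would then iterate over candidate INT6_CATS values, re-export with each policy, and compare artifact size against the resulting quantization gap.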