val_bpb: 1.5248
Architecture: —
Optimizer: —
Artifact Size: 13,144,462 bytes
Training Techniques
- Quantization: mixed int6/int8 (bits: 6; scope: attn, mlp)
- Sequence Length: sequence_length (train_length: 1024; eval_length: null)
- Architecture: MLP width multiplier, i.e. model width multiplier for MLP layers (parameters: {"MLP_MULT": 2})
- Compression: zlib (level: null)
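The export pipeline above (mixed int6/int8 quantization over the attn/mlp scopes, then zlib compression of the artifact) can be sketched as follows. This is an assumed, minimal illustration, not the submission's actual exporter: the symmetric per-tensor scheme, the name-matching scope check, and the JSON-based artifact layout in `export_artifact` are all hypothetical choices; only the bit widths, the attn/mlp scopes, and the zlib stage come from the table.

```python
# Hedged sketch: symmetric per-tensor quantization (int6 for attn/mlp
# tensors, int8 elsewhere), then zlib compression of the packed artifact.
import json
import zlib

import numpy as np

def export_artifact(weights: dict[str, np.ndarray],
                    int6_scopes=("attn", "mlp")) -> bytes:
    """Quantize tensors whose name matches an int6 scope to 6 bits,
    keep the rest at 8 bits, and return a zlib-compressed artifact."""
    payload = {}
    for name, w in weights.items():
        bits = 6 if any(s in name for s in int6_scopes) else 8
        max_q = 31 if bits == 6 else 127  # symmetric range for the bit width
        scale = float(np.max(np.abs(w))) / max_q or 1.0  # avoid zero scale
        q = np.clip(np.round(w / scale), -max_q, max_q).astype(np.int8)
        payload[name] = {"bits": bits, "scale": scale, "data": q.tobytes().hex()}
    return zlib.compress(json.dumps(payload).encode())

rng = np.random.default_rng(0)
blob = export_artifact({"attn.w": rng.normal(size=(8, 8)),
                        "head.w": rng.normal(size=(8, 8))})
```

Storing 6-bit values in int8 containers, as here, preserves the artifact size unless the compressor exploits the narrower value range, which is consistent with the size-preservation observation in the contributions below.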
Novel Contributions
- Introduced an INT6_CATS environment variable to control the mixed int6 quantization policy at export time
- Performed an auto-research policy sweep over mixed-quantization categories under a fixed 10-minute budget on 1xH100 hardware
- Demonstrated that aggressive int6 export policies preserve artifact size but open a large quantization gap, yielding a useful negative result
- Provided a structured negative-result submission to prune dead-end search directions before expensive multi-GPU attempts
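An environment variable like the INT6_CATS mentioned above could gate the quantization scope at export time as sketched here. Only the variable name comes from the contribution list; the comma-separated format, the `int6_categories` helper, and the default categories are assumptions for illustration.

```python
# Hypothetical sketch of INT6_CATS parsing; the comma-separated convention
# and the attn/mlp default are assumed, not documented behavior.
import os

def int6_categories(default=("attn", "mlp")) -> tuple[str, ...]:
    """Parse INT6_CATS as a comma-separated category list, e.g. 'attn,mlp,embed'."""
    raw = os.environ.get("INT6_CATS", "")
    cats = tuple(c.strip() for c in raw.split(",") if c.strip())
    return cats or tuple(default)

os.environ["INT6_CATS"] = "attn,mlp,embed"
print(int6_categories())  # ('attn', 'mlp', 'embed')
```

Keeping the policy in an environment variable means the sweep can vary quantization categories without touching training code, which matches the export-time framing of the contribution.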