PR #1996

open

Non-record: Mixed-Temperature Self-Generated GPTQ Calibration on V6

by ryankagygamestop2
val_bpb
1.2519
Architecture
Transformer
Optimizer
Artifact Size
13.37 MB

Training Techniques

Quantization
GPTQ
bits: 4
scope: weights
GPTQ
bits: 6
scope: embeddings
Evaluation
sliding window eval
parameters: {"stride":64,"context_length":448}
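The sliding-window evaluation can be sketched as follows: overlapping windows of length 448 advance by a stride of 64, and each token is scored exactly once by the first window that reaches it. This is a minimal sketch; `logprob_fn` is a hypothetical model interface returning per-token log-probs (in nats) for `window[1:]`, conditioned on the preceding tokens in the window.

```python
import math

def sliding_window_nll(token_ids, logprob_fn, context_length=448, stride=64):
    """Total NLL over token_ids using overlapping windows.

    logprob_fn(window) -> list of log-probs for window[1:] (hypothetical).
    Each token is counted once: later windows only score tokens that
    earlier windows did not cover.
    """
    total_nll, n_scored = 0.0, 0
    prev_end = 0  # one past the last token position already scored
    for start in range(0, max(len(token_ids) - 1, 1), stride):
        window = token_ids[start:start + context_length]
        if len(window) < 2:
            break
        lps = logprob_fn(window)  # len(window) - 1 values
        # Skip log-probs for positions an earlier window already scored.
        new_from = max(prev_end - (start + 1), 0)
        for lp in lps[new_from:]:
            total_nll += -lp
            n_scored += 1
        prev_end = start + len(window)
        if prev_end >= len(token_ids):
            break
    return total_nll, n_scored
```

Bits-per-byte then follows as `total_nll / (math.log(2) * n_bytes)`, where `n_bytes` is the size of the raw evaluation text.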
Sequence Length
sequence_length
train_length: null
eval_length: 2048
Compression
lzma
level: 9
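The reported artifact size is after lzma compression at level 9, which in Python's standard library corresponds to `preset=9`. A minimal sketch, assuming the artifact is serialized to raw bytes first (the XZ container format is an assumption):

```python
import lzma

def compress_artifact(raw: bytes, level: int = 9) -> bytes:
    # preset=9 matches the "level: 9" setting listed above;
    # FORMAT_XZ is an assumed container choice.
    return lzma.compress(raw, format=lzma.FORMAT_XZ, preset=level)

def size_mb(blob: bytes) -> float:
    # Size in megabytes, as reported in the Artifact Size field.
    return len(blob) / 1e6
```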
Other
other
Autoregressive self-generated calibration using multinomial sampling instead of greedy argmax, with a mixed-temperature split (32 sequences at T=0.5 and 32 at T=1.5) and BOS-only seeding.
parameters: {"calib_sequences":64,"temperatures":[0.5,1.5],"split":[32,32],"bos_seed":true}
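The calibration-generation step described above can be sketched as: seed each sequence with only the BOS token, then sample autoregressively with temperature-scaled multinomial sampling rather than argmax, producing 32 sequences per temperature. This is a hedged sketch; `next_logits_fn` is a hypothetical model interface returning a list of logits for the next token given a prefix.

```python
import math
import random

def sample_next(logits, temperature):
    # Temperature-scaled softmax followed by multinomial sampling
    # (this replaces the baseline's greedy argmax).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    probs = [math.exp(s - m) for s in scaled]
    z = sum(probs)
    return random.choices(range(len(logits)), weights=[p / z for p in probs], k=1)[0]

def generate_calibration(next_logits_fn, bos_id, seq_len,
                         n_per_temp=32, temps=(0.5, 1.5)):
    """Self-generate GPTQ calibration data: BOS-only seeding, then
    autoregressive multinomial sampling at each temperature in turn."""
    sequences = []
    for t in temps:
        for _ in range(n_per_temp):
            seq = [bos_id]  # BOS-only seeding
            while len(seq) < seq_len:
                seq.append(sample_next(next_logits_fn(seq), t))
            sequences.append(seq)
    return sequences
```

With the defaults this yields the 64 calibration sequences listed in the parameters (32 at T=0.5 and 32 at T=1.5).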

Novel Contributions

  • Replaced greedy argmax self-generation with multinomial sampling for GPTQ calibration
  • Introduced mixed-temperature calibration using a 50/50 split between low and high temperatures
  • Reported a validation BPB improvement of 0.0054 over a single-temperature T=0.8 baseline on the V6 stack
  • Demonstrated that calibration-distribution choice can materially affect GPTQ performance