PR #1042

open

Record: Adaptive Precision Embedding Quantization (4-seed mean val_bpb=1.1217)

by nothingLiva
val_bpb
1.1217
Architecture
Transformer
Optimizer
Artifact Size
15.8 MB

Training Techniques

Quantization
mixed int6/int8
bits: null
scope: embeddings

Novel Contributions

  • Adaptive-precision embedding quantization driven by token frequency
  • int8 assigned to the 100 most frequent tokens, int6 to all remaining tokens
  • Higher precision reserved for the frequent tokens, which cover 53.2% of the text
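The scheme above can be sketched roughly as follows. This is a minimal illustration, not the submission's code: the function names, the symmetric per-row scaling, and the use of NumPy are all assumptions; the PR only specifies int8 for the top 100 tokens by frequency and int6 for the rest, scoped to the embeddings.

```python
import numpy as np

def quantize_embeddings(emb, token_freqs, top_k=100):
    # Hypothetical sketch of adaptive-precision embedding quantization:
    # the top_k most frequent tokens get int8, the rest int6. int6 values
    # are stored in int8 containers here and simply use the narrower
    # range [-31, 31]; a real artifact would bit-pack them to save space.
    order = np.argsort(token_freqs)[::-1]         # token ids, most frequent first
    high_prec = set(order[:top_k].tolist())
    q = np.empty(emb.shape, dtype=np.int8)
    scales = np.empty(emb.shape[0], dtype=np.float32)
    bits = np.empty(emb.shape[0], dtype=np.uint8)
    for i, row in enumerate(emb):
        b = 8 if i in high_prec else 6
        qmax = 2 ** (b - 1) - 1                   # 127 for int8, 31 for int6
        scale = max(float(np.abs(row).max()) / qmax, 1e-12)
        q[i] = np.clip(np.round(row / scale), -qmax, qmax).astype(np.int8)
        scales[i] = scale
        bits[i] = b
    return q, scales, bits

def dequantize(q, scales):
    # Recover approximate float embeddings via the per-row scales.
    return q.astype(np.float32) * scales[:, None]
```

The rounding error per entry is at most half a quantization step, so the frequent rows (step = rowmax/127) are reconstructed about four times more precisely than the int6 rows (step = rowmax/31), which is the trade-off the 53.2% coverage figure is meant to justify.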