PR #1042
openRecord: Adaptive Precision Embedding Quantization (4-seed mean val_bpb=1.1217)
by nothingLivaView on GitHub
val_bpb: 1.1217
Architecture: Transformer
Optimizer: —
Artifact Size: 15.8 MB
Training Techniques: Quantization (mixed int6/int8, scope: embeddings)
Novel Contributions
- Adaptive-precision embedding quantization keyed to token frequency
- int8 for the 100 most frequent tokens, int6 for all remaining tokens
- Higher precision reserved for the frequent tokens, which cover 53.2% of the text
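A minimal sketch of the idea above, assuming a NumPy embedding matrix and a token-frequency count vector (the function names, the per-row symmetric quantization scheme, and the use of an int8 container for the 6-bit values are illustrative assumptions, not the PR's actual implementation; real int6 storage would additionally require bit-packing to realize the size savings):

```python
import numpy as np

def quantize_rows(emb, bits):
    # Symmetric per-row quantization: scale each row so its max
    # magnitude maps to the largest positive level for `bits`.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(emb).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard all-zero rows
    q = np.clip(np.round(emb / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def adaptive_quantize(emb, token_counts, top_k=100, hi_bits=8, lo_bits=6):
    # Rank tokens by frequency; the top_k most frequent rows get
    # hi_bits of precision, every other row gets lo_bits.
    order = np.argsort(-token_counts)
    hi_ids, lo_ids = order[:top_k], order[top_k:]
    q = np.empty(emb.shape, dtype=np.int8)
    scales = np.empty((emb.shape[0], 1), dtype=np.float32)
    q[hi_ids], scales[hi_ids] = quantize_rows(emb[hi_ids], hi_bits)
    q[lo_ids], scales[lo_ids] = quantize_rows(emb[lo_ids], lo_bits)
    return q, scales

def dequantize(q, scales):
    # Reconstruct approximate embeddings from codes and per-row scales.
    return q.astype(np.float32) * scales
```

Example usage: `q, s = adaptive_quantize(emb, counts)` followed by `dequantize(q, s)` recovers the table with larger rounding error on the infrequent (int6) rows, matching the rationale that tokens covering most of the text deserve the finer grid.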