PR #1042

open

Record: Adaptive Precision Embedding Quantization (4-seed mean val_bpb=1.1217)

by nothingLiva
val_bpb
1.1217
Architecture
Transformer
Optimizer
Artifact Size
15.8 MB

Training Techniques

Quantization
mixed int6/int8
bits: null
scope: embeddings

Novel Contributions

  • Adaptive-precision embedding quantization driven by token frequency
  • int8 assigned to the 100 most frequent tokens, int6 to all remaining tokens
  • Higher precision reserved for the frequent tokens, which cover 53.2% of the text
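The scheme above can be sketched roughly as follows. This is a minimal illustration, not the submission's code: the function names, the symmetric per-row scaling, and the use of NumPy are all assumptions; the PR only specifies int8 for the top 100 tokens by frequency and int6 for the rest, scoped to the embeddings.

```python
import numpy as np

def quantize_embeddings(emb, token_freqs, top_k=100):
    # Hypothetical sketch of adaptive-precision embedding quantization:
    # the top_k most frequent tokens get int8, the rest int6. int6 values
    # are stored in int8 containers here and simply use the narrower
    # range [-31, 31]; a real artifact would bit-pack them to save space.
    order = np.argsort(token_freqs)[::-1]         # token ids, most frequent first
    high_prec = set(order[:top_k].tolist())
    q = np.empty(emb.shape, dtype=np.int8)
    scales = np.empty(emb.shape[0], dtype=np.float32)
    bits = np.empty(emb.shape[0], dtype=np.uint8)
    for i, row in enumerate(emb):
        b = 8 if i in high_prec else 6
        qmax = 2 ** (b - 1) - 1                   # 127 for int8, 31 for int6
        scale = max(float(np.abs(row).max()) / qmax, 1e-12)
        q[i] = np.clip(np.round(row / scale), -qmax, qmax).astype(np.int8)
        scales[i] = scale
        bits[i] = b
    return q, scales, bits

def dequantize(q, scales):
    # Recover approximate float embeddings via the per-row scales.
    return q.astype(np.float32) * scales[:, None]
```

The rounding error per entry is at most half a quantization step, so the frequent rows (step = rowmax/127) are reconstructed about four times more precisely than the int6 rows (step = rowmax/31), which is the trade-off the 53.2% coverage figure is meant to justify.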