PR #2021

open

Non-record: BigramHash uppercase enrichment (1xH100, val_bpb 1.3132)

val_bpb

1.3132

Architecture

Transformer

Optimizer

Not specified

Artifact Size

15,343,280 bytes

Training Techniques

Architecture

BigramHash

Adds a learned hashed local-memory table using previous and current token ids, injected into the transformer.

parameters: {"table_size":12288,"inject_layer":1,"mode":"additive"}
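A minimal sketch of how a hashed bigram table with additive injection could work, in pure Python for readability. Only `table_size` (12288) and the additive mode come from the PR. The hash function, the multiplier constant, the toy width `D_MODEL`, the padding id for the first position, and the names `bigram_slot` and `inject` are assumptions made for illustration. The PR does not state its hash function, and `hidden` stands in for the activations at the injection layer.

```python
import random

TABLE_SIZE = 12288  # from the PR parameters
D_MODEL = 8         # toy width, assumption for illustration only


def bigram_slot(prev_id: int, cur_id: int, table_size: int = TABLE_SIZE) -> int:
    """Map a (previous, current) token-id pair to a table row.

    The multiplier is an arbitrary constant picked for this sketch;
    it is not taken from the PR.
    """
    return ((prev_id * 1_000_003) ^ cur_id) % table_size


rng = random.Random(0)
# In a real model this table is learned; here it holds small random values.
table = [[rng.gauss(0.0, 0.02) for _ in range(D_MODEL)] for _ in range(TABLE_SIZE)]


def inject(hidden, token_ids):
    """Add the hashed bigram row to each position's hidden vector."""
    out = []
    prev = 0  # hypothetical padding id used before the first token
    for h, cur in zip(hidden, token_ids):
        row = table[bigram_slot(prev, cur)]
        out.append([a + b for a, b in zip(h, row)])
        prev = cur
    return out


token_ids = [17, 942, 17, 942]
hidden = [[0.0] * D_MODEL for _ in token_ids]
mixed = inject(hidden, token_ids)
```

Positions 1 and 3 see the same (previous, current) pair, so they receive the same table row. That is the sense in which the table acts as a local memory keyed on the bigram.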

Other

other

Tokenizer-derived hash-address enrichment using an uppercase flag in the hash key.

parameters: {"hash_enrich":"upper"}
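One way an uppercase flag might enter the hash key is sketched below. The PR only states that the flag is tokenizer-derived and part of the key. The rule used to compute the flag (first alphabetic character of the token string), the key layout, and the names `is_upper_token` and `enriched_slot` are hypothetical.

```python
TABLE_SIZE = 12288  # from the BigramHash parameters in the PR


def is_upper_token(piece: str) -> int:
    """Return 1 if the first alphabetic character of the token string is
    uppercase, else 0. This rule is an assumption made for the sketch."""
    for ch in piece:
        if ch.isalpha():
            return int(ch.isupper())
    return 0


def enriched_slot(prev_id: int, cur_id: int, cur_piece: str,
                  table_size: int = TABLE_SIZE) -> int:
    """Hash a bigram to a table row with the uppercase flag folded into the key.

    The flag occupies the lowest bit of the key. Because table_size is even,
    that bit survives the modulo, so uppercase and lowercase keys land on
    rows of different parity.
    """
    flag = is_upper_token(cur_piece)
    key = (((prev_id * 1_000_003) ^ cur_id) << 1) | flag
    return key % table_size


# Same token ids, different case flag: the two addresses differ.
slot_upper = enriched_slot(17, 942, "Cat")
slot_lower = enriched_slot(17, 942, "cat")
```

Under this layout the flag changes which bigrams share a row rather than adding parameters: the table keeps its 12288 rows, and only the addressing changes.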

Quantization

int8

bits: 8

scope: model

Compression

zlib

level: null
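The quantization and compression steps can be illustrated with a small round trip. This is a sketch under stated assumptions, not the PR's export code: symmetric per-tensor scaling, the byte layout (a float32 scale followed by int8 values), and the names `quantize_int8`, `pack`, and `unpack` are all hypothetical. `zlib.compress` is called without a level argument, which uses zlib's default, on the assumption that this is what `level: null` means.

```python
import random
import struct
import zlib


def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q,
    with q an integer in [-127, 127]."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def pack(q, scale):
    """Serialize the scale and int8 values, then compress with zlib."""
    raw = struct.pack("<f", scale) + struct.pack(f"<{len(q)}b", *q)
    return zlib.compress(raw)  # default compression level


def unpack(blob):
    """Decompress and dequantize back to floats."""
    raw = zlib.decompress(blob)
    (scale,) = struct.unpack_from("<f", raw, 0)
    q = struct.unpack_from(f"<{len(raw) - 4}b", raw, 4)
    return [scale * v for v in q]


rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(1000)]
q, scale = quantize_int8(weights)
blob = pack(q, scale)
restored = unpack(blob)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

The reconstruction error per weight is bounded by roughly half a quantization step (`scale / 2`), and the packed blob is smaller than the same weights stored as float32.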

Ran a BigramHash sweep on a 1xH100 setup.
Adding an uppercase flag to the hash key improved BigramHash performance, reaching val_bpb 1.3132.
In this single-seed sweep, simple tokenizer-derived hash enrichment outperformed more complex variants.
Documented as a non-record, single-seed run with full reproduction artifacts.