PR #643
open
Non-record: Mac mini M4 16GB, no H100s, still golfing (val_bpb=1.5672)
by frido22
val_bpb
1.5672
Architecture
—
Optimizer
—
Artifact Size
15,962,372 bytes
Training Techniques
Quantization
int8
bits: 8
scope: all
Weight Averaging
EMA
parameters: {"scope":"projection matrices","timing":"late EMA after first quant-aware roundtrip"}
Compression
zlib
level: null
Novel Contributions
- Use of late EMA over only the projection matrices that remain int8-quantized after the first quant-aware roundtrip
- Reapplication of the exact final quantization roundtrip before saving to improve final compressed artifact score
- Submission as a hardware-specific non-record entry on Apple Silicon Mac mini M4 16GB without 8xH100 GPUs
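The first two contributions can be illustrated with a minimal sketch. This is not the PR's code: it assumes symmetric per-tensor int8 quantization, reads "roundtrip" as quantize-then-dequantize, and uses a made-up EMA decay of 0.99. All names (`quantize_int8`, `dequantize_int8`, `roundtrip`, `ema_update`, `proj`) are hypothetical, and the training steps are simulated with random noise.

```python
import zlib
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization (an assumed scheme, not the PR's)."""
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Map int8 codes back to float32 weights."""
    return q.astype(np.float32) * scale

def roundtrip(w):
    """Quantize then dequantize, so w lands on the int8 grid."""
    q, scale = quantize_int8(w)
    return dequantize_int8(q, scale)

def ema_update(ema, w, decay=0.99):
    """One exponential-moving-average step; the decay value is an assumption."""
    return decay * ema + (1.0 - decay) * w

rng = np.random.default_rng(0)
proj = rng.normal(size=(64, 64)).astype(np.float32)  # hypothetical projection matrix

# 1. First quant-aware roundtrip: snap the weights to the int8 grid.
proj = roundtrip(proj)

# 2. Late EMA over the projection matrix only, across simulated training steps.
ema = proj.copy()
for _ in range(10):
    proj = proj + 0.01 * rng.normal(size=proj.shape).astype(np.float32)
    ema = ema_update(ema, proj)

# 3. Reapply the final roundtrip to the averaged weights before saving,
#    so the stored tensor is exactly what the int8 export will reproduce.
q, scale = quantize_int8(ema)
final = dequantize_int8(q, scale)

# 4. Compress the int8 codes with zlib to get the artifact payload.
blob = zlib.compress(q.tobytes())
print(len(blob), "compressed bytes for", q.size, "int8 weights")
```

In this sketch the final roundtrip matters because averaging moves weights off the int8 grid. Snapping them back before saving means the evaluated weights and the saved weights are identical, and quantizing `final` again returns the same codes.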