PR #1717

open

[4090 Reproduction] Achieve 1.1249 val_bpb (Note: 18KB over limit)

by samchill666
val_bpb
1.1249
Architecture
Transformer
Optimizer
Artifact Size
16018877 bytes
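
The "18KB over limit" note in the title can be checked with simple arithmetic. The sketch below assumes a 16,000,000-byte artifact limit. The PR summary does not state the limit, so that figure is only inferred from the reported size and the stated overage. The names `ASSUMED_LIMIT_BYTES` and `overage_bytes` are illustrative and do not come from the PR.

```python
ARTIFACT_SIZE_BYTES = 16_018_877  # artifact size reported in this PR
ASSUMED_LIMIT_BYTES = 16_000_000  # ASSUMPTION: limit not stated in the PR summary


def overage_bytes(size: int, limit: int) -> int:
    """Return how many bytes `size` exceeds `limit` by (0 if within the limit)."""
    return max(0, size - limit)


over = overage_bytes(ARTIFACT_SIZE_BYTES, ASSUMED_LIMIT_BYTES)
print(f"over the assumed limit by {over} bytes")  # 18877 bytes
```

Under that assumed limit the artifact is 18,877 bytes too large, which matches the roughly 18 KB overage given in the title.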

Training Techniques

Quantization
GPTQ
bits: 6
scope: all
Compression
brotli
level: null
Other
other
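Dynamic gradient accumulation scaled by world size: the number of accumulation steps is 96 // world_size, so a single RTX 4090 (world size 1) runs 96 accumulation steps per optimizer step, which fits training in its memory while preserving compatibility with official evaluation.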
parameters: {"grad_accum_steps":"96 // world_size"}
other
Environment-variable fallbacks for the time and step limits. The wallclock limit was disabled locally during stress testing; the official submission adheres to the 10-minute limit.
parameters: {"max_wallclock_seconds":600}
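
The two techniques listed under "Other" could be wired together roughly as follows. This is a minimal sketch, not the PR's actual code: only the expression `96 // world_size` and the 600-second default come from the summary above. The environment variable names `MAX_WALLCLOCK_SECONDS`, `MAX_STEPS`, and `WORLD_SIZE`, the helper names, and the convention that a non-positive wallclock value disables the time cap are all assumptions made for illustration.

```python
import os

TOTAL_ACCUM_STEPS = 96                 # from the PR: grad_accum_steps = 96 // world_size
DEFAULT_MAX_WALLCLOCK_SECONDS = 600.0  # from the PR: 10-minute limit


def grad_accum_steps(world_size: int) -> int:
    """Scale gradient accumulation inversely with the number of processes."""
    steps = TOTAL_ACCUM_STEPS // world_size if world_size >= 1 else 0
    if steps < 1:
        raise ValueError(f"unsupported world_size: {world_size}")
    return steps


def resolve_limits(env=None):
    """Read time and step limits from the environment, with fallbacks.

    Returns (max_wallclock_seconds, max_steps); None means "no limit".
    The variable names are hypothetical, not taken from the PR.
    """
    env = os.environ if env is None else env
    wallclock = float(env.get("MAX_WALLCLOCK_SECONDS", DEFAULT_MAX_WALLCLOCK_SECONDS))
    raw_steps = env.get("MAX_STEPS")
    max_steps = int(raw_steps) if raw_steps else None
    # ASSUMED convention: a non-positive value disables the wallclock cap,
    # e.g. for local stress testing.
    max_wallclock = wallclock if wallclock > 0 else None
    return max_wallclock, max_steps


if __name__ == "__main__":
    world_size = int(os.environ.get("WORLD_SIZE", "1"))  # assumed launcher variable
    print("grad_accum_steps:", grad_accum_steps(world_size))
    print("limits:", resolve_limits())
```

With no variables set, the sketch falls back to the 600-second cap and 96 accumulation steps on a single process. A world size that does not divide 96 evenly would change the total number of micro-batches per optimizer step, which a real implementation may want to reject.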

Novel Contributions

  • Reproduction of the SOTA architecture on a single RTX 4090
  • Reported validation score of 1.1249 bpb
  • 6-bit GPTQ quantization validated on consumer hardware
  • brotli compression pipeline validated on consumer hardware
  • Dynamic gradient accumulation for 24GB VRAM compatibility
  • Environment-variable-based fallback handling for time and step limits