PR #1778 (open)

[Non record] Mercury in Retrograde - text diffusion model

by simon-marcus
val_bpb: 1.4564
Architecture: Transformer
Optimizer: (not specified)
Artifact Size: 15,677,283 bytes

Training Techniques

Quantization: int8 (bits: 8, scope: final artifact)
Compression: zlib (level: null)
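The artifact pipeline above (int8 quantization of the weights, then zlib compression) could be sketched roughly as follows. The function names `quantize_int8` and `pack_artifact`, and the symmetric per-tensor scheme, are assumptions for illustration, not the PR's actual code; "level: null" is read here as the zlib default level.

```python
import zlib

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (hypothetical sketch,
    not necessarily the scheme used in this PR)."""
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero tensors.
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def pack_artifact(weights):
    """Quantize to int8, then zlib-compress the raw bytes at the
    library's default compression level."""
    q, scale = quantize_int8(weights)
    # & 0xFF maps negative int8 values into the 0..255 byte range.
    blob = zlib.compress(bytes(b & 0xFF for b in q))
    return blob, scale
```

Dequantization would multiply the int8 values back by the stored scale; how the scale itself is serialized in the 15,677,283-byte artifact is not stated in the PR.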
Other: Direct text diffusion / denoising objective with mixed continuation and infill masking
parameters: {"recipe":"mercury_hybrid35_mixsc"}
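A minimal sketch of what mixed continuation and infill corruption could look like. The function name, the even split between the two modes, and position-level (rather than span-level) infill masking are all assumptions; the PR identifies its recipe only as `mercury_hybrid35_mixsc`.

```python
import random

def corrupt(tokens, mask_id, rate, infill_prob=0.5):
    """Hypothetical sketch: corrupt a sequence either as a continuation
    (mask a contiguous suffix) or as infill (mask random interior
    positions), to train a denoiser for any-order generation."""
    n = len(tokens)
    out = list(tokens)
    k = max(1, int(rate * n))  # number of positions to mask
    if random.random() < infill_prob:
        # Infill mode: mask k distinct positions anywhere in the sequence.
        for i in random.sample(range(n), k):
            out[i] = mask_id
    else:
        # Continuation mode: mask the final k tokens.
        for i in range(n - k, n):
            out[i] = mask_id
    return out
```

Either branch masks exactly `rate * n` positions, so a single corruption-rate knob covers both tasks.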
Other: Progressive hybrid corruption schedule from 25% to 35%
parameters: {"start_corruption":0.25,"end_corruption":0.35}
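The parameters give only the 25% and 35% endpoints; assuming a simple linear ramp over training steps, the schedule could be:

```python
def corruption_rate(step, total_steps, start=0.25, end=0.35):
    """Linear ramp of the corruption rate over training (the linear
    interpolation is an assumption; the PR states only the endpoints)."""
    # Clamp progress to [0, 1] so steps past the horizon stay at `end`.
    t = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return start + (end - start) * t
```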
Other: Self-conditioning with commit fraction
parameters: {"commit_fraction":0.75}
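One way to read "commit fraction" is that each refinement pass commits only 75% of the currently masked positions and re-masks the rest for the next pass. The sketch below follows that reading; the random selection of which positions to commit is an assumption (a real implementation might commit the most confident predictions instead).

```python
import random

def self_condition_step(tokens, mask_id, predict, commit_fraction=0.75):
    """Hypothetical sketch of one self-conditioned refinement step:
    predict all positions, then commit only a fraction of the masked
    ones, leaving the remainder masked for the next pass."""
    masked = [i for i, t in enumerate(tokens) if t == mask_id]
    if not masked:
        return list(tokens)
    preds = predict(tokens)  # model output for every position
    n_commit = max(1, int(commit_fraction * len(masked)))
    out = list(tokens)
    # Commit a random subset; the PR's selection rule is not given.
    for i in random.sample(masked, n_commit):
        out[i] = preds[i]
    return out
```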
Other: Small clean-language prior to stabilize denoising
parameters: null
Other: Parallel refinement / denoising evaluation for continuation and infill
parameters: {"tasks":["continuation","infill"]}
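The parallel refinement interface could be sketched as an iterative denoising loop in which every masked position is predicted at once and a growing share is committed per step. All names here are hypothetical; the point is only that the step count is the speed-quality knob the contributions list mentions.

```python
def parallel_refine(tokens, mask_id, predict, steps=4):
    """Hypothetical sketch: iterative parallel denoising.  Each step
    predicts every masked position simultaneously and commits an even
    share of them, so fewer steps means faster but rougher output."""
    out = list(tokens)
    for s in range(steps):
        masked = [i for i, t in enumerate(out) if t == mask_id]
        if not masked:
            break
        preds = predict(out)
        # Spread the remaining masked positions over the remaining
        # steps; the final step commits everything that is left.
        n_commit = max(1, len(masked) // (steps - s))
        for i in masked[:n_commit]:
            out[i] = preds[i]
    return out
```

With `steps=1` this degenerates to a single parallel fill (maximum throughput); larger step counts trade speed for more refinement passes.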

Novel Contributions

  • A diffusion-native text model submission explicitly designed around denoising rather than a mostly-autoregressive hybrid
  • Mixed continuation and infill masking to support any-order generation
  • Progressive hybrid corruption schedule with self-conditioning
  • Parallel refinement interface that exposes a speed-quality tradeoff
  • Matched benchmark showing very high throughput but poor token accuracy under the challenge budget