val_bpb: 1.4564
Architecture: Transformer
Optimizer: —
Artifact Size: 15,677,283 bytes
Training Techniques
Quantization: int8 (bits: 8, scope: final artifact)
Compression: zlib (level: not specified)
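
The two entries above imply a simple artifact pipeline: quantize the weights to int8, then zlib-compress the packed bytes. A minimal sketch of such a pipeline, assuming symmetric per-tensor quantization; the packing format and function names are illustrative, not the submission's actual code, and the zlib level is left at the library default since the card does not specify one.

```python
import zlib
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 quantization (illustrative scheme)."""
    scale = float(np.abs(w).max()) / 127.0 or 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def pack_artifact(tensors: dict) -> bytes:
    """Quantize each tensor, then zlib-compress the concatenated bytes
    (compression level left at the zlib default, as in the card)."""
    blobs = []
    for name, w in tensors.items():
        q, scale = quantize_int8(w)
        header = f"{name}|{q.shape}|{scale}\n".encode()
        blobs.append(header + q.tobytes())
    return zlib.compress(b"".join(blobs))

# Example: pack a toy weight matrix and report the compressed size
artifact = pack_artifact({"w0": np.random.randn(256, 256).astype(np.float32)})
print(len(artifact), "bytes")
```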
Other
- Direct text diffusion / denoising objective with mixed continuation and infill masking (recipe: `mercury_hybrid35_mixsc`); see the masking sketch below
- Progressive hybrid corruption schedule from 25% to 35% (`start_corruption`: 0.25, `end_corruption`: 0.35); covered by the same sketch
- Self-conditioning with commit fraction (`commit_fraction`: 0.75); see the refinement sketch below
- Small clean-language prior to stabilize denoising
- Parallel refinement / denoising evaluation for continuation and infill (tasks: continuation, infill); see the evaluation sketch below
Novel Contributions
- A diffusion-native text model submission explicitly designed around denoising rather than a mostly-autoregressive hybrid
- Mixed continuation and infill masking to support any-order generation
- Progressive hybrid corruption schedule with self-conditioning
- Parallel refinement interface that exposes a speed-quality tradeoff
- A matched benchmark showing very high throughput but poor token accuracy under the challenge budget