PR #1778 (open)

[Non record] Mercury in Retrograde - text diffusion model

by simon-marcus
val_bpb: 1.4564
Architecture: Transformer
Optimizer: (not specified)
Artifact Size: 15,677,283 bytes

Training Techniques

Quantization: int8 (bits: 8, scope: final artifact)
Compression: zlib (level: null)
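The artifact pipeline above (int8 quantization of the weights, then zlib compression) could be sketched roughly as follows. The function names `quantize_int8` and `pack_artifact`, and the symmetric per-tensor scheme, are assumptions for illustration, not the PR's actual code; "level: null" is read here as the zlib default level.

```python
import zlib

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (hypothetical sketch,
    not necessarily the scheme used in this PR)."""
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero tensors.
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def pack_artifact(weights):
    """Quantize to int8, then zlib-compress the raw bytes at the
    library's default compression level."""
    q, scale = quantize_int8(weights)
    # & 0xFF maps negative int8 values into the 0..255 byte range.
    blob = zlib.compress(bytes(b & 0xFF for b in q))
    return blob, scale
```

Dequantization would multiply the int8 values back by the stored scale; how the scale itself is serialized in the 15,677,283-byte artifact is not stated in the PR.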
Other: Direct text diffusion / denoising objective with mixed continuation and infill masking
parameters: {"recipe":"mercury_hybrid35_mixsc"}
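A minimal sketch of what mixed continuation and infill corruption could look like. The function name, the even split between the two modes, and position-level (rather than span-level) infill masking are all assumptions; the PR identifies its recipe only as `mercury_hybrid35_mixsc`.

```python
import random

def corrupt(tokens, mask_id, rate, infill_prob=0.5):
    """Hypothetical sketch: corrupt a sequence either as a continuation
    (mask a contiguous suffix) or as infill (mask random interior
    positions), to train a denoiser for any-order generation."""
    n = len(tokens)
    out = list(tokens)
    k = max(1, int(rate * n))  # number of positions to mask
    if random.random() < infill_prob:
        # Infill mode: mask k distinct positions anywhere in the sequence.
        for i in random.sample(range(n), k):
            out[i] = mask_id
    else:
        # Continuation mode: mask the final k tokens.
        for i in range(n - k, n):
            out[i] = mask_id
    return out
```

Either branch masks exactly `rate * n` positions, so a single corruption-rate knob covers both tasks.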
Other: Progressive hybrid corruption schedule from 25% to 35%
parameters: {"start_corruption":0.25,"end_corruption":0.35}
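The parameters give only the 25% and 35% endpoints; assuming a simple linear ramp over training steps, the schedule could be:

```python
def corruption_rate(step, total_steps, start=0.25, end=0.35):
    """Linear ramp of the corruption rate over training (the linear
    interpolation is an assumption; the PR states only the endpoints)."""
    # Clamp progress to [0, 1] so steps past the horizon stay at `end`.
    t = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return start + (end - start) * t
```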
Other: Self-conditioning with commit fraction
parameters: {"commit_fraction":0.75}
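One way to read "commit fraction" is that each refinement pass commits only 75% of the currently masked positions and re-masks the rest for the next pass. The sketch below follows that reading; the random selection of which positions to commit is an assumption (a real implementation might commit the most confident predictions instead).

```python
import random

def self_condition_step(tokens, mask_id, predict, commit_fraction=0.75):
    """Hypothetical sketch of one self-conditioned refinement step:
    predict all positions, then commit only a fraction of the masked
    ones, leaving the remainder masked for the next pass."""
    masked = [i for i, t in enumerate(tokens) if t == mask_id]
    if not masked:
        return list(tokens)
    preds = predict(tokens)  # model output for every position
    n_commit = max(1, int(commit_fraction * len(masked)))
    out = list(tokens)
    # Commit a random subset; the PR's selection rule is not given.
    for i in random.sample(masked, n_commit):
        out[i] = preds[i]
    return out
```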
Other: Small clean-language prior to stabilize denoising
parameters: null
Other: Parallel refinement / denoising evaluation for continuation and infill
parameters: {"tasks":["continuation","infill"]}
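The parallel refinement interface could be sketched as an iterative denoising loop in which every masked position is predicted at once and a growing share is committed per step. All names here are hypothetical; the point is only that the step count is the speed-quality knob the contributions list mentions.

```python
def parallel_refine(tokens, mask_id, predict, steps=4):
    """Hypothetical sketch: iterative parallel denoising.  Each step
    predicts every masked position simultaneously and commits an even
    share of them, so fewer steps means faster but rougher output."""
    out = list(tokens)
    for s in range(steps):
        masked = [i for i, t in enumerate(out) if t == mask_id]
        if not masked:
            break
        preds = predict(out)
        # Spread the remaining masked positions over the remaining
        # steps; the final step commits everything that is left.
        n_commit = max(1, len(masked) // (steps - s))
        for i in masked[:n_commit]:
            out[i] = preds[i]
    return out
```

With `steps=1` this degenerates to a single parallel fill (maximum throughput); larger step counts trade speed for more refinement passes.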

Novel Contributions

  • A diffusion-native text model submission explicitly designed around denoising rather than a mostly-autoregressive hybrid
  • Mixed continuation and infill masking to support any-order generation
  • Progressive hybrid corruption schedule with self-conditioning
  • Parallel refinement interface that exposes a speed-quality tradeoff
  • Matched benchmark showing very high throughput but poor token accuracy under the challenge budget