val_bpb: 1.8658
Architecture: Transformer
Optimizer: —
Artifact Size: 15,980,840 bytes
Training Techniques
Architecture: JEPA-style regression transformer. A causal transformer trained to predict next-token embeddings with an MSE objective instead of direct token classification.
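As a minimal sketch of what this objective might look like in PyTorch (the `backbone` and `embed_table` names are hypothetical; the source does not publish its training code):

```python
import torch
import torch.nn.functional as F

def regression_loss(backbone, embed_table, tokens):
    """MSE between predicted latents and next-token target embeddings.

    backbone: causal transformer mapping token ids -> per-position latents
    embed_table: (vocab_size, d_model) target embedding matrix
    tokens: (batch, seq_len) LongTensor of token ids
    """
    latents = backbone(tokens[:, :-1])     # (B, T-1, d_model) predicted latents
    targets = embed_table[tokens[:, 1:]]   # (B, T-1, d_model) next-token embeddings
    return F.mse_loss(latents, targets)    # pure regression; no softmax or CE
```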
MLP (524,288 parameters): Small auxiliary rescuer decoder that maps raw regression latents v_void to corrected latents v_rescued before final decoding.
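A sketch of one plausible shape for the rescuer: the reported 524,288 parameters equals 2 × 512², consistent with two bias-free 512×512 linear layers, but the actual width, depth, and activation are assumptions:

```python
import torch.nn as nn

class Rescuer(nn.Module):
    """Maps raw regression latents v_void to corrected latents v_rescued.

    Two bias-free 512x512 linear layers give exactly 2 * 512^2 = 524,288
    parameters, matching the reported count; the true layout may differ.
    """
    def __init__(self, d_model: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_model, bias=False),
            nn.GELU(),
            nn.Linear(d_model, d_model, bias=False),
        )

    def forward(self, v_void):
        return self.net(v_void)  # v_rescued
```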
Regularization: weight decay
Compression: zlib (level unspecified)
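The artifact size is presumably measured on the zlib-compressed checkpoint. A sketch of that measurement, assuming a PyTorch state dict and zlib's default compression level (the source leaves the level unspecified):

```python
import io
import zlib
import torch

def compress_checkpoint(model, path="artifact.bin.zlib"):
    """Serialize a state dict, zlib-compress it, and return the size in bytes."""
    buf = io.BytesIO()
    torch.save(model.state_dict(), buf)
    blob = zlib.compress(buf.getvalue())  # default level; source gives none
    with open(path, "wb") as f:
        f.write(blob)
    return len(blob)
```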
Novel Contributions
- JEPA-style regression language model for the Parameter Golf challenge
- Auxiliary rescuer decoder that corrects regression latents before token decoding
- Regression-only training with MSE against target token embeddings
- Three-seed stability evidence under the 10-minute 8xH100 budget
- Comparison against standalone regression-only baselines