val_bpb: 1.2917
Architecture: GPT
Optimizer: —
Artifact Size: 16,009,531 bytes
Training Techniques

Architecture
- tied embeddings: Uses tied embeddings in the GPT baseline. (parameters: null)
- KV head count: Uses 4 KV heads in the GPT baseline. (parameters: {"kv_heads": 4})
- CTM workspace bridge: Adds a small causal CTM workspace bridge. (parameters: {"slots": 4, "dimensions": 64})

Other
- Routes workspace writes with novelty plus salience scoring. (parameters: {"CTM_NOVELTY_GAIN": 1, "CTM_SALIENCE_GAIN": 0.5})
- Uses prediction-error-gated skip connections. (parameters: {"SKIP_GATE_MODE": "error"})
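The workspace write routing above can be sketched as follows. This is a minimal numpy sketch, not the run's actual code: only the two gain values and the 4-slot × 64-dimension workspace shape come from the recorded parameters, while the concrete novelty definition (one minus max cosine similarity), the salience definition (bounded candidate norm), and the weakest-slot eviction policy are all assumptions for illustration.

```python
import numpy as np

# Gains taken from the recorded parameters; everything else is assumed.
CTM_NOVELTY_GAIN = 1.0
CTM_SALIENCE_GAIN = 0.5

def route_write(workspace, candidate, eps=1e-8):
    """Score a candidate vector for writing into the CTM workspace."""
    # Novelty: 1 - max cosine similarity between candidate and any slot.
    c = candidate / (np.linalg.norm(candidate) + eps)
    w = workspace / (np.linalg.norm(workspace, axis=1, keepdims=True) + eps)
    novelty = 1.0 - float(np.max(w @ c))
    # Salience: bounded norm of the candidate (a stand-in definition).
    salience = float(np.tanh(np.linalg.norm(candidate)))
    score = CTM_NOVELTY_GAIN * novelty + CTM_SALIENCE_GAIN * salience
    # Evict the weakest slot (assumed policy).
    slot = int(np.argmin(np.linalg.norm(workspace, axis=1)))
    return score, slot

rng = np.random.default_rng(0)
workspace = rng.standard_normal((4, 64))  # 4 slots of 64 dims, per the parameters
candidate = rng.standard_normal(64)
score, slot = route_write(workspace, candidate)
```

A write would then proceed only when the combined score clears some threshold, with the candidate replacing the selected slot.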
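The prediction-error-gated skip entry admits a similarly small sketch. Here SKIP_GATE_MODE="error" comes from the recorded parameters; the sigmoid gating function and the choice to gate the skip path (rather than the residual branch) are assumptions made for illustration.

```python
import numpy as np

def error_gated_skip(x, branch_out, pred_error, mode="error"):
    """Combine a skip path x with a branch output; when mode == "error"
    (cf. SKIP_GATE_MODE), scale the skip by a per-token gate derived
    from prediction error. Gating function and gated path are assumed."""
    if mode != "error":
        return x + branch_out                      # plain residual connection
    gate = 1.0 / (1.0 + np.exp(-pred_error))       # sigmoid: higher error opens the gate
    return gate[:, None] * x + branch_out

T, D = 6, 8
rng = np.random.default_rng(0)
x = rng.standard_normal((T, D))
branch = rng.standard_normal((T, D))
err = rng.random(T)                                # per-token prediction error
y = error_gated_skip(x, branch, err)
```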
Quantization
- QAT (bits: 8, scope: export-matched int8 path)

Compression
- zlib (level: null)
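The "export-matched int8 path" scope can be read as: the same int8 quantizer is applied as fake-quantization during QAT and again at export, so the trained weights already sit on the export grid. A minimal numpy sketch of that idea follows; the symmetric per-tensor scheme is an assumption, not a detail recorded in the snapshot.

```python
import numpy as np

def quantize_int8(w, eps=1e-8):
    """Symmetric per-tensor int8 quantization (assumed scheme). The same
    routine serves both QAT fake-quant and the final export, which is what
    makes the path "export-matched"."""
    scale = np.max(np.abs(w)) / 127.0 + eps
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def fake_quant(w):
    # Fake quantization for training: quantize, then dequantize back to float.
    q, scale = quantize_int8(w)
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).standard_normal((8, 8)).astype(np.float32)
wq = fake_quant(w)
# Because export reuses the identical quantizer, weights that passed through
# fake_quant are already on the export grid: re-quantizing wq reproduces wq.
```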
Sequence Length
- train_length: 1024
- eval_length: null
Novel Contributions
- Standalone non-record snapshot of a CTM-based proxy run
- Causal CTM workspace bridge with 4 slots × 64 dimensions
- Novelty-plus-salience workspace write routing
- Prediction-error-gated skip connections
- Export-matched tail QAT aligned with the final int8 artifact path
- Packaging of train_gpt.py, train.log, README.md, and submission.json as a reproducible in-progress snapshot