PR #285

open

Add non-record local A100 TTT eval-stride0 submission

by DanishjeetSingh

val_bpb

1.3510

Architecture

—

Optimizer

—

Artifact Size

11,876,675 bytes

Training Techniques

Quantization

int8

bits: 8

scope: model weights / submission artifact

Evaluation

stride-based eval

parameters: {"stride":0}

Test-Time Training

LoRA TTT

parameters: null

Compression

zlib

level: null
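
To make the int8 quantization and zlib compression entries above concrete, here is a minimal sketch of how a weight tensor could be quantized to 8 bits and packed into a compressed artifact. It is an illustration only, not the submission's actual code: the function names `quantize_int8`, `pack_artifact`, and `unpack_artifact` are hypothetical, the symmetric per-tensor scaling scheme is an assumption, and since the listing gives `level: null`, the sketch falls back to zlib's default level.

```python
import struct
import zlib


def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (assumed scheme, not confirmed by the PR).

    Returns (scale, q) where every value in q lies in [-127, 127].
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return scale, q


def pack_artifact(weights, level=-1):
    """Quantize, serialize as float32 scale + int8 payload, then zlib-compress.

    level=-1 selects zlib's default compression level.
    """
    scale, q = quantize_int8(weights)
    raw = struct.pack("<f", scale) + struct.pack(f"<{len(q)}b", *q)
    return zlib.compress(raw, level)


def unpack_artifact(blob):
    """Invert pack_artifact: decompress and dequantize back to floats."""
    raw = zlib.decompress(blob)
    (scale,) = struct.unpack_from("<f", raw, 0)
    q = struct.unpack_from(f"<{len(raw) - 4}b", raw, 4)
    return [scale * v for v in q]
```

Under this sketch, `len(pack_artifact(weights))` is the quantity that would correspond to an artifact size in bytes, and the round trip through `unpack_artifact` recovers each weight to within about half a quantization step.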

Other

other

Training capped by wall-clock time on a local 1xA100 run

parameters: {"max_wallclock_seconds":600,"hardware":"1x NVIDIA A100-SXM4-40GB","train_shards":80}
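
A wall-clock cap like `max_wallclock_seconds` is typically enforced by checking elapsed time between optimizer steps. The sketch below shows one way this could look; it is an assumption about the general pattern, not the PR's training script. The names `train_with_wallclock_cap` and `step_fn` are hypothetical, and the injectable `clock` argument exists only so the loop can be tested without waiting.

```python
import time


def train_with_wallclock_cap(step_fn, max_wallclock_seconds=600.0,
                             max_steps=None, clock=time.monotonic):
    """Run step_fn(step) until the time budget or max_steps is exhausted.

    The budget is checked before each step, so a step that is already
    running is allowed to finish. Returns the number of completed steps.
    """
    start = clock()
    step = 0
    while max_steps is None or step < max_steps:
        if clock() - start >= max_wallclock_seconds:
            break  # time budget spent: stop before starting another step
        step_fn(step)
        step += 1
    return step
```

Because the check happens only between steps, the total run time can exceed the cap by up to the duration of one step; a stricter budget would need to subtract an estimate of the step time.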