PR #357

open

docs: add TIPS.md and resolve environment dependency issues (#280, #82, #43)

by adityagupta26
val_bpb
1.1928
Architecture
Transformer
Optimizer
Adam
Artifact Size

Training Techniques

Test-Time Training
LoRA TTT
parameters: {"rank":8,"learning_rate":0.01,"betas":[0.9,0.95]}
Evaluation
sliding window eval
parameters: {"chunk_size":256,"eval_seq_len":1024,"batch_size":64}
Architecture
weight tying
Suggested weight-sharing strategy to stay within parameter constraints.
parameters: null
Other
other
Document-aware evaluation: each validation document is evaluated in isolation, and the LoRA parameters are reset between documents so that adaptation cannot leak from one document to the next.
parameters: null
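
The per-document reset described above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the PR's implementation: the names `LoRAAdapter`, `split_documents`, `evaluate_per_document`, and `adapt_and_score` are hypothetical, and splitting the validation stream on a beginning-of-document token is an assumption about how documents are delimited. The sketch relies on the usual LoRA initialization (small random `A`, zero `B`), so a freshly reset adapter contributes nothing to the frozen weights.

```python
import random


class LoRAAdapter:
    """Low-rank update delta_W = B @ A for a frozen (d_out x d_in) weight (hypothetical class)."""

    def __init__(self, d_in, d_out, rank=8, seed=0):
        self.d_in, self.d_out, self.rank, self.seed = d_in, d_out, rank, seed
        self.reset()

    def reset(self):
        # Usual LoRA init: A is small random, B is zero, so B @ A == 0 after a reset.
        rng = random.Random(self.seed)
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(self.d_in)]
                  for _ in range(self.rank)]
        self.B = [[0.0] * self.rank for _ in range(self.d_out)]

    def delta(self, x):
        # Compute B @ (A @ x), the adapter's additive contribution for input x.
        h = [sum(a * xi for a, xi in zip(row, x)) for row in self.A]
        return [sum(b * hj for b, hj in zip(row, h)) for row in self.B]


def split_documents(tokens, bos_id):
    """Split a flat token stream into documents, starting a new one at each bos_id (assumed delimiter)."""
    docs = []
    for t in tokens:
        if t == bos_id or not docs:
            docs.append([])
        docs[-1].append(t)
    return docs


def evaluate_per_document(docs, adapter, adapt_and_score):
    """Reset the adapter before every document, then let the caller adapt and score it."""
    total = 0.0
    for doc in docs:
        adapter.reset()  # nothing learned on the previous document survives
        total += adapt_and_score(doc, adapter)
    return total
```

Here `adapt_and_score` stands in for whatever routine runs the test-time updates and returns the loss for one document; the only point of the sketch is that `reset()` runs before each document, so no adapter state carries over.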

Novel Contributions

  • Added TIPS.md documentation with actionable advice for participants
  • Document-aware LoRA test-time training during evaluation
  • Per-document adaptation with LoRA adapters reset between documents
  • Sliding-window / strided evaluation over overlapping chunks
  • Clarified that tokenizer size does not count toward the 16 MB artifact limit
  • Added flash-attn to requirements.txt to fix the RunPod environment dependency issue
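
As a rough illustration of the sliding-window / strided evaluation listed above, the sketch below computes window boundaries so that each token is scored exactly once while later windows reuse earlier tokens as context. It assumes that `chunk_size` (256) acts as the stride and `eval_seq_len` (1024) as the window length; the PR summary does not spell out that mapping, and the function name `sliding_windows` is hypothetical.

```python
def sliding_windows(n_tokens, seq_len=1024, stride=256):
    """Return (begin, end, score_from) triples for strided evaluation.

    The model would see tokens[begin:end]; only tokens[score_from:end]
    count toward the loss, and the rest of the window is context.
    """
    windows = []
    prev_end = 0
    for begin in range(0, n_tokens, stride):
        end = min(begin + seq_len, n_tokens)
        # Score only the tokens that no earlier window has already scored.
        windows.append((begin, end, prev_end))
        prev_end = end
        if end == n_tokens:
            break
    return windows


# Example: a 2000-token document with the parameters from this PR summary.
for begin, end, score_from in sliding_windows(2000):
    print(f"window [{begin}:{end}] scores [{score_from}:{end}]")
```

With these settings, every window after the first scores at most 256 new tokens and conditions on at least 768 tokens of earlier context, which is the usual motivation for overlapping windows over disjoint chunks. The indices here are token positions only; the one-position shift between inputs and next-token targets is left out.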