Learn
A deep dive into the techniques that power top Parameter Golf submissions. Start from the fundamentals and work your way up.
1. Quantization
Quantization Fundamentals
Reducing model size while preserving performance
5 sections
2. Architecture
Architecture Tricks
U-Net skips, BigramHash, SmearGate, and more
9 sections
3. Optimizer
The Muon Optimizer
Why Parameter Golf's best players abandoned Adam
11 sections
4. Weight Averaging
Weight Averaging
SWA, EMA, and ensemble-like approaches that cost almost nothing
12 sections
5. Compression
Compression
zstd, pruning, and artifact size optimization
6 sections
6. Test-Time Training
Test-Time Training
Adapting models at inference time with LoRA, score-first TTT, and per-document fine-tuning
7 sections
7. LR Schedule
Learning Rate Schedules
Warmdown, cosine, and schedule optimization
7 sections
8. Initialization
Initialization
OrthoInit and weight initialization strategies
6 sections
9. Regularization
Regularization
Weight decay, pruning, and overfitting prevention
7 sections
10. Evaluation
Evaluation Strategies
Sliding window eval, N-gram mixing, and scoring techniques
8 sections