dropout

Regularization
Used in: 11 PRs
Best BPB: 1.0270
Avg BPB: 1.3214

Hyperparameters Across PRs

pr_number  parameters
340        {"rate":0.1,"scope":"attention and MLP blocks"}
345        {"loop_dropout":true}
820        {"rate":0}
1021       {"rates":[0.3,0.05]}
1491       {"rate":0}
1520       {"type":"Norm-PCT-Dropout","top_l2_norm_row_fraction":0.01,"target":"FFN intermediate activations"}
1520       {"type":"skip gates","description":"sigmoid-gated U-Net skip connections"}
1650
1822       {"type":"stochastic depth","expected_value_scaling":true}
2032       {"stochastic_depth_max":0.02}
2039       {"stochastic_depth_max":0.02}
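
For readers unfamiliar with the stochastic depth entries above, here is a minimal sketch of how a per-layer drop rate capped at 0.02 and expected-value scaling could fit together. It is not taken from any of the listed PRs. The function names `layer_drop_rates` and `stochastic_depth`, the linear ramp across layers, and the list-of-floats activations are all assumptions made for illustration. The table only records `stochastic_depth_max` and `expected_value_scaling`.

```python
import random


def layer_drop_rates(num_layers, max_rate):
    """Hypothetical schedule: ramp the drop probability linearly
    from 0 at the first layer to max_rate at the last layer."""
    if num_layers == 1:
        return [max_rate]
    return [max_rate * i / (num_layers - 1) for i in range(num_layers)]


def stochastic_depth(x, branch, drop_rate, training, rng=random):
    """Residual update x + branch(x), skipping the branch with
    probability drop_rate while training.

    Expected-value scaling: a surviving branch is divided by the keep
    probability, so the training-time expectation equals the
    evaluation-time output x + branch(x).
    """
    if not training or drop_rate == 0.0:
        return [xi + bi for xi, bi in zip(x, branch(x))]
    if rng.random() < drop_rate:
        return list(x)  # branch dropped: only the identity path remains
    keep = 1.0 - drop_rate
    return [xi + bi / keep for xi, bi in zip(x, branch(x))]


# Example: 12 layers with the cap value from the table (0.02).
rates = layer_drop_rates(12, 0.02)
```

With a cap as small as 0.02, even the deepest layer is skipped on only about one training step in fifty, so this acts as a light regularizer rather than a large change to the effective depth.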