dropout

Regularization
Used in: 8 PRs
Best BPB: 1.0824
Avg BPB: 1.4126

Hyperparameters Across PRs

pr_number | parameters
340       | {"rate":0.1,"scope":"attention and MLP blocks"}
345       | {"loop_dropout":true}
820       | {"rate":0}
1021      | {"rates":[0.3,0.05]}
1491      | {"rate":0}
1520      | {"type":"Norm-PCT-Dropout","top_l2_norm_row_fraction":0.01,"target":"FFN intermediate activations"}
1520      | {"type":"skip gates","description":"sigmoid-gated U-Net skip connections"}
1650      |
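
As a reading aid for the `rate` values above, here is a minimal sketch of standard inverted dropout. It assumes `rate` is the probability of zeroing each activation, which is the usual convention; the PRs listed here may implement it differently, and this sketch does not cover `loop_dropout`, `Norm-PCT-Dropout`, or the skip gates. The function name `dropout` and its arguments are hypothetical and chosen only for illustration.

```python
import random


def dropout(values, rate, training=True, rng=None):
    """Inverted dropout over a flat list of floats (illustrative sketch).

    Assumption: `rate` is the per-element drop probability, as in the
    {"rate": 0.1} entry above. Survivors are scaled by 1 / (1 - rate) so
    the expected value of each element is unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    # With rate 0 (as in PRs 820 and 1491) or at eval time, this is a no-op.
    if not training or rate == 0.0:
        return list(values)
    if rng is None:
        rng = random.Random()
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


if __name__ == "__main__":
    activations = [0.5, -1.2, 3.0, 0.7]
    print(dropout(activations, rate=0.1, rng=random.Random(0)))
    print(dropout(activations, rate=0.0))  # unchanged
```

Under this convention, a `rate` of 0 disables dropout entirely, so those rows describe runs trained without it.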