gradient clipping
Regularization
Used in: 45 PRs
Best BPB: 0.7227
Avg BPB: 1.2223
Submissions
| PR | Author | BPB | Note |
|---|---|---|---|
| #46 | vavo | 1.2697 | |
| #63 | yahya010 | 1.1598 | RECORD |
| #96 | saml212 | 1.1764 | |
| #103 | MatthewHRockwell | 1.5000 | |
| #107 | m0at | 1.1648 | |
| #114 | saml212 | 1.1574 | |
| #151 | mrdavtan | 1.2045 | |
| #173 | tamoghnokandar | 1.1532 | |
| #181 | manfromnowhere143 | 1.2194 | |
| #191 | chris-buckley | 1.1598 | |
| #196 | sicauzxl | 1.3825 | |
| #212 | mrdavtan | 1.1329 | |
| #238 | kellyvv | 1.5164 | |
| #244 | simon-marcus | 1.2064 | |
| #256 | IvGolovach | 1.1779 | |
| #310 | vishesh9131 | 1.1787 | |
| #321 | andreanjos | 1.1864 | |
| #333 | mahsumaktas | 1.1565 | |
| #343 | joeynyc | 1.2459 | |
| #344 | aryanbhosale | 1.1330 | |
| #383 | joelnishanth | 1.1320 | |
| #384 | anantdgoel | 1.2882 | |
| #426 | aniketio-ctrl | 1.2026 | |
| #512 | MatoTeziTanka | 0.9512 | |
| #535 | raahilshah | 1.1204 | |
| #536 | jaksenc | 1.5140 | |
| #568 | MatoTeziTanka | 0.7853 | |
| #569 | gowtham0992 | 1.1175 | |
| #605 | bigbag | 0.7227 | |
| #633 | MatoTeziTanka | 1.1526 | |
| #635 | aryanbhosale | 1.1330 | |
| #668 | Christopher-Lee-McClendon | 1.0920 | |
| #671 | keshav55 | 1.1807 | |
| #691 | xexyz | 1.0988 | |
| #705 | seanward | 1.2151 | |
| #856 | iverbovoy | 1.1454 | |
| #858 | nickferrantelive | 1.2135 | |
| #862 | grim-hitman0XX | 1.3036 | |
| #939 | brian386 | 1.2519 | |
| #994 | singhaikshitijjain | 1.4315 | |
| #1299 | Ribin545 | 1.8184 | |
| #1378 | Rajat123456789 | 1.1711 | |
| #1388 | CiprianFlorin-Ifrim | 1.5390 | |
| #1391 | Abhinav-Avasarala | 1.4716 | |
| #1393 | Abhinav-Avasarala | 1.4716 | |

Hyperparameters Across PRs
| PR | Parameters |
|---|---|
| 46 | {"norm":1} |
| 63 | {"max_norm":0.3} |
| 96 | {"norm":0.3} |
| 103 | {"norm":1} |
| 107 | {"norm":0.3} |
| 114 | {"grad_clip_norm":0.3} |
| 151 | {"norm":1} |
| 173 | {"grad_clip_norm":0.3} |
| 181 | {"grad_clip_norm":0.3} |
| 191 | {"grad_clip_norm":0.3} |
| 196 | {"grad_clip_norm":0.5} |
| 212 | {"grad_clip_norm":1} |
| 238 | {"grad_clip_norm":0.3} |
| 244 | {"grad_clip_norm":0.3} |
| 256 | {"grad_clip_norm":0.3} |
| 310 | {"grad_clip_norm":1} |
| 321 | {"norm":1} |
| 333 | {"norm":0.3} |
| 343 | — |
| 344 | {"clip_norm":0.3} |
| 383 | {"clip_norm":0.3} |
| 384 | {"norm":0.3} |
| 426 | {"grad_clip_norm":0.3} |
| 512 | {"clip_norm":0.3} |
| 535 | {"clip_value":0.3} |
| 536 | {"clip_value":1,"type":"global"} |
| 568 | {"value":0.3} |
| 569 | {"clip_value":0.3} |
| 605 | {"max_norm":1} |
| 633 | {"clip_value":0.3} |
| 635 | {"clip_value":0.3} |
| 668 | {"clip_norm":0.3} |
| 671 | {"norm":0.3} |
| 691 | {"clip_norm":1} |
| 705 | {"max_norm":0.3} |
| 856 | {"grad_clip_norm":0.3} |
| 858 | {"clip_norm":0.3} |
| 862 | {"norm":0.3} |
| 939 | {"norm":1} |
| 994 | — |
| 1299 | — |
| 1378 | {"clip_norm":0.3} |
| 1388 | {"norm":1} |
| 1391 | — |
| 1393 | — |
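
For readers unfamiliar with what these thresholds control, here is a minimal sketch of gradient clipping by global L2 norm in plain Python. It assumes the values in the table (for example 0.3 or 1) are maximum global norms, which the table does not confirm for every PR: keys such as `clip_value` could instead mean element-wise clipping. The helper name `clip_by_global_norm` and the `eps` term are illustrative choices, not taken from any submission.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Return gradients rescaled so their joint L2 norm is at most max_norm.

    grads is a list of flat lists of floats, one per parameter tensor.
    The norm is computed over all entries together, not per tensor.
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # Scale down only when the norm exceeds the threshold; never scale up.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in vec] for vec in grads]
    return clipped, total_norm


# Hypothetical gradients for two parameter tensors; global norm is 13.
grads = [[3.0, 4.0], [0.0, 12.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=0.3)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))

# Gradients already below the threshold pass through unchanged.
small, _ = clip_by_global_norm([[0.1, 0.0]], max_norm=0.3)
```

In a PyTorch training loop the same operation is usually done with `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.3)`, called after `loss.backward()` and before `optimizer.step()`. Whether each listed PR uses that call is not stated on this page.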