RoPE
Architecture
Used in: 150 PRs
Best BPB: 0.0180
Avg BPB: 1.1576
Submissions

| pr_number | author | bpb | record |
|---|---|---|---|
| 59 | notapplica | 1.2160 | |
| 61 | saml212 | 1.2154 | |
| 69 | TevBenji | 1.1708 | |
| 85 | hydeh3r3 | 1.2156 | |
| 86 | aruniyer | 1.1502 | RECORD |
| 120 | andrewgcodes | 0.9588 | |
| 126 | Athenox14 | 1.7510 | |
| 139 | ksang123 | 1.2029 | |
| 156 | dexhunter | 1.1602 | |
| 160 | ChaseWNorton | 1.1623 | |
| 163 | Focus2321 | 1.2091 | |
| 186 | mahsumaktas | 1.1565 | |
| 198 | jfprincz | 1.1318 | RECORD |
| 201 | machdragon | 1.1551 | |
| 206 | dexhunter | 1.1507 | |
| 215 | JayCheng113 | 1.1548 | |
| 216 | alons23 | 0.8100 | |
| 218 | bopmite | 1.1248 | |
| 223 | 0xjaishy | 1.1326 | |
| 231 | lenguyen1807 | 1.2036 | |
| 243 | kvmukilan | 1.1704 | |
| 249 | kvmukilan | 1.1704 | |
| 254 | timowhite88 | 1.1303 | |
| 265 | unnir | 1.1307 | |
| 281 | charmquark1984 | 1.1381 | |
| 287 | jfprincz | 1.1271 | RECORD |
| 290 | ibarrajo | 1.1354 | |
| 298 | MrINVISO | 1.2271 | |
| 305 | Naazimsnh02 | 1.1672 | |
| 318 | sseanliu | 1.1284 | |
| 324 | crony-io | 1.1702 | |
| 325 | Aum08Desai | 1.1462 | |
| 326 | crony-io | 1.2890 | |
| 330 | bopmite | 1.1609 | |
| 333 | mahsumaktas | 1.1565 | |
| 367 | ksang123 | 1.1770 | |
| 369 | signalrush | 1.1328 | |
| 379 | dannywillowliu-uchi | 1.1257 | |
| 385 | dentity007 | 1.1488 | |
| 393 | CrimsonSithria | 1.2417 | |
| 394 | greqone | 1.1247 | |
| 408 | markste-in | 1.4784 | |
| 422 | albertorkive | 1.1396 | |
| 436 | CrimsonSithria | 1.2392 | |
| 442 | sjp611 | 1.1027 | |
| 467 | ADIITJ | 1.1428 | |
| 512 | MatoTeziTanka | 0.9512 | |
| 517 | lukacf | 0.9789 | |
| 548 | LoquiAuris | 1.0865 | |
| 568 | MatoTeziTanka | 0.7853 | |
| 583 | suchihype | 1.1489 | |
| 589 | RoyiRa | 1.1178 | |
| 628 | Christopher-Lee-McClendon | 1.0983 | |
| 633 | MatoTeziTanka | 1.1526 | |
| 649 | pall23-mech | 1.2073 | |
| 659 | deanbrr | 1.0920 | |
| 664 | tsbiosky | 1.2982 | |
| 674 | newjordan | 1.0461 | |
| 678 | SPThole | 1.3525 | |
| 681 | Alfaxad | 1.4775 | |
| 684 | DeepReinforce | 1.0574 | |
| 702 | lukacf | 1.0244 | |
| 706 | newjordan | 1.0461 | |
| 714 | Upsalla | 1.1187 | |
| 730 | janwww | 1.1570 | |
| 738 | gowtham0992 | 1.0970 | |
| 759 | markste-in | 1.3092 | |
| 767 | RichiiiTV | 0.9209 | |
| 771 | sunnypatneedi | 1.0705 | |
| 773 | siddhantparadox | 1.1532 | |
| 777 | Robby955 | 0.9623 | |
| 785 | SirSaltySalmon | 1.5364 | |
| 797 | armantsaturian | 0.8960 | |
| 852 | Prush69 | 1.1189 | |
| 894 | albertorkive | 1.1821 | |
| 905 | anthony-maio | 1.8587 | |
| 914 | mkenney2 | 1.1873 | |
| 920 | CiprianFlorin-Ifrim | 1.1539 | |
| 924 | THUQiXuan | 0.0280 | |
| 925 | THUQiXuan | 0.0281 | |
| 979 | 0xadvait | 1.1387 | |
| 992 | TimS-ml | 1.4054 | |
| 994 | singhaikshitijjain | 1.4315 | |
| 999 | aamodbhatt | 1.1179 | |
| 1002 | SoHarshh | 1.1650 | |
| 1009 | SoHarshh | 1.1574 | |
| 1019 | abaybektursun | 1.1147 | RECORD |
| 1026 | danielxmed | 1.0945 | |
| 1028 | newjordan | 0.8104 | |
| 1030 | sofiabod | 0.1130 | |
| 1055 | sanyalsunny111 | 0.9693 | |
| 1056 | sofiabod | 0.0180 | |
| 1068 | LappyG | 1.1510 | |
| 1092 | teddyoweh | 1.1219 | |
| 1097 | danielxmed | 1.3355 | |
| 1100 | agalimova | 1.1465 | |
| 1106 | agalimova | 1.1465 | |
| 1107 | mradassaad | 1.5633 | |
| 1120 | newjordan | 1.1099 | |
| 1130 | Gusanidas | 1.1140 | |
| 1139 | ivanontech | 1.1801 | |
| 1140 | newjordan | 1.1874 | |
| 1152 | ericdatum | 1.7942 | |
| 1154 | LucasErcolano | 1.7757 | |
| 1169 | Bortlesboat | 1.1126 | |
| 1212 | Gusanidas | 1.1108 | |
| 1227 | himanshudongre | 1.4841 | |
| 1241 | aiejvn | 0.9901 | |
| 1245 | mkenney2 | 1.1470 | |
| 1254 | Elarwei001 | 1.1070 | |
| 1263 | xexyz | 0.9354 | |
| 1273 | DushyantChetiwal | 1.2196 | |
| 1280 | aamodbhatt | 1.1156 | |
| 1283 | newjordan | 1.1373 | |
| 1293 | 5en5e1 | 1.2409 | |
| 1302 | vlivashkin | 1.1078 | |
| 1308 | newjordan | 1.1364 | |
| 1312 | adi-suresh01 | 1.3299 | |
| 1322 | newjordan | 1.0854 | |
| 1325 | monisha-max | 1.3868 | |
| 1337 | sergimichi | 1.2079 | |
| 1349 | LocalX991 | 1.3693 | |
| 1364 | stukenov | 1.1025 | |
| 1371 | aarjunsrinivasan | 1.4709 | |
| 1388 | CiprianFlorin-Ifrim | 1.5390 | |
| 1392 | Its-Just-Crump | 1.1020 | |
| 1395 | dttdrv | 1.0924 | |
| 1396 | erichroepke | 1.1067 | |
| 1400 | tmancino | 1.1035 | |
| 1403 | Rhoahndur | 1.3485 | |
| 1407 | OnlyJundong | 1.0960 | |
| 1411 | Blakethefn | 1.5568 | |
| 1424 | OnlyJundong | 1.0858 | |
| 1434 | ranausmanai | 1.5207 | |
| 1449 | codeprakhar25 | 1.3680 | |
| 1458 | newjordan | 1.1057 | |
| 1479 | andrewbaggio1 | 1.1450 | |
| 1486 | AlirezaAlampour | 1.6656 | |
| 1489 | joshkmartinez | 1.0736 | |
| 1494 | G3sparky | 1.1220 | |
| 1501 | SPThole | 1.1159 | |
| 1502 | SPThole | 1.1147 | |
| 1509 | Lumi-node | 1.1962 | |
| 1570 | yufang67 | 1.0970 | |
| 1581 | aiejvn | 1.2321 | |
| 1607 | inin-zou | 1.4765 | |
| 1626 | dexhunter | 1.0719 | |
| 1699 | lsb | 1.4831 | |
| 1709 | Bananakin1 | 1.1470 | |
| 1732 | Victory963 | 1.0785 | |

Hyperparameters Across PRs

| pr_number | parameters |
|---|---|
| 59 | {"train_length":1024,"eval_length":2048} |
| 61 | — |
| 69 | — |
| 85 | {"rope_base":50000} |
| 86 | — |
| 120 | {"base":200000} |
| 126 | — |
| 139 | {"base":200000} |
| 156 | {"q_gain_init":1.5} |
| 160 | — |
| 163 | {"base":50000} |
| 186 | {"base":50000} |
| 198 | — |
| 201 | — |
| 206 | {"base":50000} |
| 215 | {"base":10000} |
| 216 | — |
| 218 | {"sequence_length":2048} |
| 223 | {"base":50000} |
| 231 | — |
| 243 | {"base":50000} |
| 249 | {"base":50000} |
| 254 | {"base":50000} |
| 265 | {"train_seq_len":1024,"auto_scales_at":2048} |
| 281 | {"base":10000} |
| 287 | — |
| 290 | {"base":50000} |
| 298 | {"base":500000} |
| 305 | {"base":10000} |
| 318 | {"train_seq_len":1024,"cache_tokens":8192,"effective_context":"50K+"} |
| 324 | {"context_length":4096} |
| 325 | {"dimensions":16} |
| 326 | {"eval_length":4096} |
| 330 | {"sequence_length":2048} |
| 333 | {"base":50000} |
| 367 | {"base":200000} |
| 369 | {"train_seq_len":1024} |
| 379 | {"dimensions":"16/64"} |
| 385 | — |
| 393 | {"base":50000} |
| 394 | {"dimensions":16} |
| 408 | {"rope_base":100000} |
| 422 | — |
| 436 | {"base":50000} |
| 442 | {"dimensions":16,"base_dimensions":64} |
| 467 | — |
| 512 | {"base":50000} |
| 517 | {"dimensions":"16/64"} |
| 548 | {"persistent":false} |
| 568 | {"base":50000} |
| 583 | {"base":50000,"partial":false} |
| 589 | {"dimensions":16,"total_dimensions":64} |
| 628 | {"partial_dims":"16/64","train_seq_length":2048} |
| 633 | {"base":50000} |
| 649 | {"dimensions":16} |
| 659 | {"dimensions":16,"total_dimensions":64} |
| 664 | — |
| 674 | {"dims":24} |
| 678 | {"rope_dims":64,"rope_base":10000} |
| 681 | {"dimensions":"16/64"} |
| 684 | {"dimensions":null} |
| 702 | {"dimensions":"16/64"} |
| 706 | {"dimensions":24} |
| 714 | {"train_seq_len":2048} |
| 730 | {"max_len":2048,"rope_base":5000} |
| 738 | {"dimensions":16,"base_dimensions":64} |
| 759 | {"base":100000} |
| 767 | {"dimensions":24} |
| 771 | {"dimensions":16,"total_dimensions":64} |
| 773 | {"rope_dims":16} |
| 777 | {"dims":"16/64"} |
| 785 | {"dimensions":16} |
| 797 | {"dimensions":"16/64"} |
| 852 | — |
| 894 | — |
| 905 | — |
| 914 | — |
| 920 | {"type":"yarn","max_len":2048} |
| 924 | {"dimensions":"16/64"} |
| 925 | {"dimensions":"16/64"} |
| 979 | — |
| 992 | — |
| 994 | — |
| 999 | {"dimensions":16} |
| 1002 | {"dimensions":16,"total_dimensions":64} |
| 1009 | {"dimensions":16} |
| 1019 | {"dimensions":16,"total_dimensions":64} |
| 1026 | {"dimensions":16,"total_dimensions":64} |
| 1028 | {"dimensions":16} |
| 1030 | {"dimensions":16} |
| 1055 | {"base":50000} |
| 1056 | {"dimensions":16} |
| 1068 | {"rope_base":10000} |
| 1092 | {"dimensions":16} |
| 1097 | — |
| 1100 | — |
| 1106 | — |
| 1107 | — |
| 1120 | {"dimensions":16} |
| 1130 | {"dimensions":"16/64"} |
| 1139 | — |
| 1140 | {"value":16} |
| 1152 | — |
| 1154 | — |
| 1169 | {"partial_dim":16} |
| 1212 | — |
| 1227 | {"dimensions":16} |
| 1241 | — |
| 1245 | — |
| 1254 | — |
| 1263 | — |
| 1273 | {"base":10000} |
| 1280 | {"dimensions":16,"base_dimensions":64} |
| 1283 | {"scales":[9,1,1]} |
| 1293 | — |
| 1302 | {"dimensions":"16/64"} |
| 1308 | {"scales":[9,1,1]} |
| 1312 | — |
| 1322 | {"dimensions":16,"base":10000} |
| 1325 | {"max_len":2048} |
| 1337 | — |
| 1349 | {"dimensions":16} |
| 1364 | {"dimensions":16,"total_dimensions":64} |
| 1371 | — |
| 1388 | {"base":5000,"max_len":2048} |
| 1392 | {"dimensions":16,"total_dimensions":64} |
| 1395 | {"dimensions":"16/64"} |
| 1396 | {"dimensions":16} |
| 1400 | {"dimensions":16} |
| 1403 | — |
| 1407 | {"dimensions":16,"total_dimensions":64} |
| 1411 | {"bases":[1000,10000,100000,1000000]} |
| 1424 | {"dimensions":16,"total_dimensions":64} |
| 1434 | {"base":5000} |
| 1449 | — |
| 1458 | {"dimensions":16} |
| 1479 | {"dimensions":16} |
| 1486 | {"base":10000} |
| 1489 | {"base":10000,"train_seq":2048} |
| 1494 | {"dimensions":16,"total_dimensions":64} |
| 1501 | {"dimensions":16,"of_total":64} |
| 1502 | {"dimensions":"16/64"} |
| 1509 | {"epsilon":0.1} |
| 1570 | {"dimensions":32} |
| 1581 | — |
| 1607 | {"base":10000} |
| 1626 | {"dimensions":16,"base_dimensions":64} |
| 1699 | {"base":10000} |
| 1709 | {"dimensions":16,"base":10000} |
| 1732 | {"rotary_dims":16,"total_dims":64} |
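
Two settings recur throughout the table above: a frequency `base` (for example 10000 or 50000) and a rotary dimension count such as 16 out of a 64-dimensional head. As a rough illustration of what such settings usually control, here is a minimal pure-Python sketch of partial RoPE. It assumes that entries like `16/64` mean "rotate the first 16 of 64 head dimensions", and it uses the interleaved pairing (dimensions 2i and 2i+1); neither detail is confirmed by the PRs listed here. The names `rope_frequencies` and `apply_partial_rope` are hypothetical and not taken from any submission.

```python
import math


def rope_frequencies(rot_dims, base=10000.0):
    """Per-pair rotation frequencies: base ** (-2i / rot_dims) for i = 0 .. rot_dims/2 - 1."""
    return [base ** (-2.0 * i / rot_dims) for i in range(rot_dims // 2)]


def apply_partial_rope(vec, position, rot_dims, base=10000.0):
    """Rotate the first `rot_dims` entries of `vec` by position-dependent angles.

    Entries beyond `rot_dims` pass through unchanged (the "partial" part).
    Assumption: adjacent dimensions (2i, 2i+1) form a rotation pair; other
    implementations pair dimension i with i + rot_dims/2 instead.
    """
    if rot_dims % 2 != 0 or rot_dims > len(vec):
        raise ValueError("rot_dims must be even and no larger than len(vec)")
    out = list(vec)
    for i, freq in enumerate(rope_frequencies(rot_dims, base)):
        angle = position * freq
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[2 * i], vec[2 * i + 1]
        out[2 * i] = x * c - y * s
        out[2 * i + 1] = x * s + y * c
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    # Illustrative 64-dim query/key vectors; rotate 16 dims with base 50000.
    q = [math.sin(i) for i in range(64)]
    k = [math.cos(i) for i in range(64)]
    near = dot(apply_partial_rope(q, 5, 16, 50000.0), apply_partial_rope(k, 3, 16, 50000.0))
    far = dot(apply_partial_rope(q, 12, 16, 50000.0), apply_partial_rope(k, 10, 16, 50000.0))
    # Both pairs are 2 positions apart, so the attention scores should agree
    # up to floating-point error.
    print(abs(near - far) < 1e-9)
```

Under these assumptions, a larger `base` lowers the rotation frequencies of the later pairs, and a smaller rotary dimension count leaves more of each head position-independent. How any individual PR in the table actually implements these settings would need to be checked against that PR.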