[libc++] remove yield from atomic::wait #120012

Merged: 7 commits, Jan 18, 2025

huixie90 (Contributor) commented Dec 15, 2024

This addresses the issue where calling yield in atomic::wait can cause the waiting thread to be assigned the lowest priority.
I ran a number of experiments; see the comments here:
#84471 (comment)
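
To make the idea concrete, here is a minimal user-level sketch of a wait that polls for a short budget and then blocks, without ever calling `std::this_thread::yield()`. It is not the libc++ implementation: the class `SpinThenBlockFlag`, its members, and the `spin_budget` parameter are hypothetical names chosen for illustration, and the blocking phase uses a plain mutex and condition variable rather than a platform wait primitive.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical illustration only; not libc++'s atomic::wait.
class SpinThenBlockFlag {
public:
    // Wait until the flag is set: poll for at most spin_budget, then block.
    void wait(std::chrono::nanoseconds spin_budget) {
        assert(spin_budget.count() >= 0);
        const auto deadline = std::chrono::steady_clock::now() + spin_budget;
        while (std::chrono::steady_clock::now() < deadline) {
            if (set_.load(std::memory_order_acquire)) return;
            // Deliberately no std::this_thread::yield() in the polling phase.
        }
        // Polling budget exhausted: sleep until notify() sets the flag.
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return set_.load(std::memory_order_acquire); });
    }

    // Set the flag and wake every blocked waiter.
    void notify() {
        {
            // Store under the mutex so a waiter cannot miss the wake-up
            // between checking the predicate and going to sleep.
            std::lock_guard<std::mutex> lock(mutex_);
            set_.store(true, std::memory_order_release);
        }
        cv_.notify_all();
    }

    bool is_set() const { return set_.load(std::memory_order_acquire); }

private:
    std::atomic<bool> set_{false};
    std::mutex mutex_;
    std::condition_variable cv_;
};
```

A waiter would call something like `flag.wait(std::chrono::microseconds(50))` while another thread calls `flag.notify()`. The budget value is arbitrary here; the point is only that the waiter goes straight from polling to blocking.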

For this patch, the benchmark was run on a MacBook Pro with a 16-core M4 Max CPU.
The dylib was compiled in Release mode, and the test was compiled with optimization=speed.
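
When reading the comparison output, the first two numeric columns (Time and CPU) appear to be relative changes, (new - old) / old, computed from the Old and New columns to their right. Negative values mean the patched build is faster or uses less CPU. The helper below is a small sketch of that arithmetic; `relative_change` is a hypothetical name, not part of any benchmark tool.

```cpp
#include <cassert>

// Relative change from an old measurement to a new one.
// Negative means the new value is smaller (an improvement for time or CPU).
double relative_change(double old_value, double new_value) {
    assert(old_value != 0.0);  // undefined for a zero baseline
    return (new_value - old_value) / old_value;
}
```

For the `NotifyEveryNus<50>, NumHighPrioTasks<0>` row at 4096, `relative_change(204514333, 25265071)` is about -0.8765, matching the CPU column: CPU time drops by roughly 88% while wall-clock time is essentially unchanged.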

Comparing ../../../build_atomic_yield2/ref_new2.json to ../../../build_atomic_yield2/no_yield_new2.json
Benchmark                                                                                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<0>>/262144                                              +0.0460         +0.0392      14949926      15637503      13633314      14167327
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<0>>/524288                                              +0.0299         +0.0290      24369327      25099004      24367214      25073900
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<0>>/1048576                                             +0.0648         +0.0640      48149060      51268517      48144857      51226733
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<0>>/4096                                           +0.0000         -0.8765     204815500     204823427     204514333      25265071
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<0>>/8192                                           +0.0000         -0.8747     409637520     409640821     408997500      51228071
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<0>>/16384                                          +0.0001         -0.8737     819244417     819351256     817022000     103217000
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<0>>/4096                                          +0.0000         -0.9029     409607694     409624937     271866333      26410600
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<0>>/8192                                          +0.0001         -0.9017     819168417     819269339     542784000      53352429
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<0>>/16384                                         +0.0001         -0.9012    1638361750    1638522929    1089486000     107684571
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<4>>/262144                                              +0.3178         +0.3068      12777744      16838266      12764732      16681233
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<4>>/524288                                              +0.2231         +0.2225      26889415      32887842      26864138      32840550
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<4>>/1048576                                             +0.1809         +0.1799      56103004      66251660      56048000      66129583
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<4>>/4096                                           -0.0029         -0.8708     205509986     204906011     204277333      26399538
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<4>>/8192                                           +0.0001         -0.8711     410286709     410314199     408608000      52667692
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<4>>/16384                                          -0.0019         -0.8713     821042916     819476441     816274000     105077000
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<4>>/4096                                          -0.0005         -0.9015     409825792     409638429     273145333      26896400
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<4>>/8192                                          -0.0027         -0.9014     821528125     819285433     545661000      53775308
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<4>>/16384                                         -0.0041         -0.9014    1645204459    1638538077    1091726000     107647000
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<7>>/16                                                  -0.4835         -0.4836          1609           831          1609           831
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<7>>/32                                                  -0.4398         -0.4399          3167          1774          3166          1773
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<7>>/64                                                  -0.4705         -0.4705          6323          3348          6323          3348
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<7>>/8                                              +0.0005         -0.8683        400109        400314        399256         52575
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<7>>/16                                             +0.0005         -0.8683        800055        800483        798797        105165
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<7>>/32                                             +0.0003         -0.8680       1600058       1600585       1597266        210903
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<7>>/8                                             +0.0004         -0.8976        800006        800365        531802         54441
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<7>>/16                                            +0.0005         -0.8982       1599965       1600765       1064885        108429
BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<7>>/32                                            +0.0005         -0.8993       3199905       3201437       2129243        214343
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<0>>/16384                     -0.0226         -0.0261        972539        950519        971198        945828
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<0>>/32768                     -0.0198         -0.0221       1933294       1895054       1930720       1888094
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<0>>/65536                     -0.0031         -0.0039       3835138       3823094       3827785       3812836
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<0>>/4096                      +0.4380         +0.4294        571762        822185        570245        815115
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<0>>/8192                      +0.0735         +0.0680       1223881       1313880       1221350       1304439
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<0>>/16384                     +0.1222         +0.1205       2442071       2740519       2433105       2726274
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<0>>/1024                     +0.1527         +1.2188        196081        226031         62647        139001
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<0>>/2048                     +0.0757         +0.4838        387858        417228        129250        191780
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<0>>/4096                     -0.0355         -0.2443        812827        784003        378109        285722
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<0>>/1024                 +0.0002         -0.0873      51202059      51211089      51135714      46670867
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<0>>/2048                 +0.0001         -0.0864     102424970     102432359     102287571      93452000
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<0>>/4096                 +0.0000         -0.0865     204828250     204834229     204528667     186845250
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<0>>/256                  +0.0003         -0.1681      12801752      12805016      12786382      10636485
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<0>>/512                  +0.0001         -0.1686      25601940      25604893      25565481      21254515
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<0>>/1024                 +0.0000         -0.1569      51210789      51211539      51150143      43122500
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<0>>/64                  +0.0064         -0.3503       3210430       3230869       2856780       1856063
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<0>>/128                 +0.0034         -0.3534       6410529       6432308       5704792       3688942
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<0>>/256                 +0.0011         -0.3600      12821419      12835646      11455934       7331250
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<0>>/256                 +0.0003         +0.0034      25600089      25608062      24375034      24457172
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<0>>/512                 +0.0002         -0.0000      51203798      51211795      48859857      48858500
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<0>>/1024                +0.0003         +0.0008     102411321     102437524      97694429      97777286
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<0>>/64                  +0.0002         -0.0464       6399846       6401009       6070487       5789091
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<0>>/128                 +0.0002         -0.0457      12799914      12802544      12069966      11518836
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<0>>/256                 +0.0001         -0.0513      25599724      25602105      24202862      22962032
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<0>>/16                 -0.0060         +0.2575       1611779       1602148        956236       1202492
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<0>>/32                 -0.0064         +0.2964       3221485       3200918       1883540       2441728
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<0>>/64                 -0.0046         +0.3087       6432692       6403368       3701725       4844611
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<4>>/256                       -0.0536         -0.0592         27458         25988         27402         25780
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<4>>/512                       -0.0469         -0.0527         54745         52175         54628         51750
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<4>>/1024                      -0.0297         -0.0340        108312        105095        108047        104378
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<4>>/64                        -0.2445         -0.2722         15109         11414         14711         10708
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<4>>/128                       -0.3132         -0.3515         32494         22317         32063         20794
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<4>>/256                       -0.1397         -0.1834         52801         45424         52170         42602
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<4>>/16                       +0.1679         +1.0248         28973         33837         13243         26814
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<4>>/32                       -0.0481         +0.7901         39155         37273         16072         28771
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<4>>/64                       -0.2075         +0.7568         57547         45606         19582         34402
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<4>>/256                  -0.0001         -0.0807      12802693      12800886      12775327      11744119
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<4>>/512                  -0.0021         -0.0867      25655056      25601315      25590407      23371667
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<4>>/1024                 -0.0007         -0.0832      51238801      51201975      51099071      46845733
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<4>>/64                   +0.0016         -0.2411       3200714       3205846       3176841       2410756
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<4>>/128                  +0.0008         -0.2373       6404239       6409102       6359649       4850544
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<4>>/256                  +0.0000         -0.2286      12805839      12806032      12713018       9806653
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<4>>/16                  +0.0272         +0.0563        811198        833264        482220        509345
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<4>>/32                  +0.0097         +0.0454       1617205       1632962        957801       1001264
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<4>>/64                  +0.0050         +0.0389       3217997       3234130       1927921       2002868
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<4>>/256                 +0.0000         -0.0009      25599763      25601039      24520071      24497071
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<4>>/512                 +0.0001         -0.0017      51200354      51203628      49086786      49005500
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<4>>/1024                +0.0001         +0.0013     102400369     102409744      97931143      98060857
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<4>>/64                  -0.0017         +0.0128       6410821       6400104       5529150       5600008
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<4>>/128                 -0.0011         +0.0215      12817263      12803569      11025889      11263032
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<4>>/256                 -0.0005         +0.0193      25612704      25600332      22089065      22515677
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<4>>/16                 -0.0164         +0.7969       1627422       1600798        665736       1196236
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<4>>/32                 -0.0095         +0.8362       3231500       3200840       1290017       2368789
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<4>>/64                 -0.0050         +0.7319       6433401       6401180       2747936       4759115
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<7>>/16                        +0.0155         +0.0092          1177          1195          1171          1181
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<7>>/32                        -0.0135         -0.0145          2103          2074          2095          2064
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<7>>/64                        +0.0022         +0.0009          3832          3841          3820          3823
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<7>>/8                        +13.9131         +9.5298          2074         30931          2041         21495
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<7>>/16                        +5.9980         +3.9816          3168         22172          3124         15563
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<7>>/32                        +3.8681         +2.3515          5412         26348          5321         17833
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<7>>/4                        +0.1312         +0.4845         31938         36127         12666         18803
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<7>>/8                        -0.0475         +0.0775         39196         37336         18078         19479
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<7>>/16                       -0.3146         -0.3853         57548         39441         31743         19513
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<7>>/8                    -0.0012         -0.0916        400610        400149        399248        362679
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<7>>/16                   -0.0032         -0.0904        802940        800342        798964        726744
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<7>>/32                   -0.0030         -0.0911       1604860       1600044       1598235       1452647
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<7>>/4                    +0.0348         -0.3515        202073        209107        199452        129352
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<7>>/8                    -0.0004         -0.3628        406727        406545        400942        255464
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<7>>/16                   -0.0176         -0.3705        821725        807256        803722        505959
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<7>>/2                   +0.0575         +0.0699        138530        146498         79463         85020
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<7>>/4                   -0.2307         -0.4182        327417        251885        222502        129448
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<7>>/8                   -0.4166         -0.5733        765495        446598        535265        228384
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<7>>/8                   +0.0001         +0.0022        800108        800227        759501        761200
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<7>>/16                  +0.0002         +0.0052       1599998       1600327       1515336       1523162
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<7>>/32                  -0.0004         +0.0029       3201730       3200529       3037191       3045996
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<7>>/4                   -0.0063         +0.3625        402752        400231        231304        315156
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<7>>/8                   -0.0029         +0.5760        802313        799998        401474        632716
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<7>>/16                  -0.0014         +0.4607       1602184       1600012        877859       1282310
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<7>>/2                  -0.0492         +0.3586        212875        202398        100437        136457
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<7>>/4                  -0.0927         +0.4432        444857        403606        181089        261350
BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<7>>/8                  -0.0704         +0.8210        861808        801099        318774        580489
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<0>>/4096                           -0.0730         -0.0762        333804        309427        333180        307803
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<0>>/8192                           -0.0775         -0.0795        701228        646853        700065        644381
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<0>>/16384                          +0.0245         +0.0229       1328777       1361291       1326360       1356745
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<3>, NumHighPrioTasks<0>>/1024                           -0.0541         -0.0562        201559        190662        201259        189940
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<3>, NumHighPrioTasks<0>>/2048                           -0.1959         -0.1986        416092        334584        415412        332927
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<3>, NumHighPrioTasks<0>>/4096                           -0.1699         -0.1710        811966        674040        810157        671584
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<5>, NumHighPrioTasks<0>>/1024                           +0.1383         +0.1301        379893        432426        377756        426885
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<5>, NumHighPrioTasks<0>>/2048                           +0.0396         +0.0339        822384        854937        818110        845866
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<5>, NumHighPrioTasks<0>>/4096                           +0.2499         +0.2451       1350161       1687588       1345121       1674845
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<7>, NumHighPrioTasks<0>>/256                            +0.0042         +0.0101        213598        214487        199282        201303
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<7>, NumHighPrioTasks<0>>/512                            -0.1034         -0.1065        428033        383755        409546        365945
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<7>, NumHighPrioTasks<0>>/1024                           -0.0972         -0.1064        833189        752165        810146        723952
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<2>, NumHighPrioTasks<0>>/1024                      +0.0001         -0.1103      51201684      51204581      51124714      45485867
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<2>, NumHighPrioTasks<0>>/2048                      -0.0000         -0.1202     102409167     102405120     102243857      89953750
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<2>, NumHighPrioTasks<0>>/4096                      +0.0000         -0.1166     204807125     204813833     204453333     180618500
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<3>, NumHighPrioTasks<0>>/256                       +0.0002         -0.1623      12803624      12806161      12778727      10704806
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<3>, NumHighPrioTasks<0>>/512                       -0.0002         -0.1414      25607327      25603223      25551852      21939152
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<3>, NumHighPrioTasks<0>>/1024                      -0.0002         -0.1653      51212196      51202776      51126643      42673625
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<5>, NumHighPrioTasks<0>>/256                       -0.0002         -0.0709      12805016      12802157      12784636      11878785
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<5>, NumHighPrioTasks<0>>/512                       -0.0002         -0.1182      25611565      25606346      25560815      22540033
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<5>, NumHighPrioTasks<0>>/1024                      -0.0002         -0.0813      51220762      51208122      51121071      46963571
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<7>, NumHighPrioTasks<0>>/64                        +0.0012         -0.2125       3219858       3223858       3194027       2515373
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<7>, NumHighPrioTasks<0>>/128                       -0.0370         -0.2643       6668396       6421601       6563402       4828970
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<7>, NumHighPrioTasks<0>>/256                       -0.0288         -0.2220      13220067      12839487      13073964      10172062
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<2>, NumHighPrioTasks<0>>/256                      -0.0000         -0.0105      25602159      25600917      24178138      23923759
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<2>, NumHighPrioTasks<0>>/512                      +0.0000         -0.0175      51201819      51203125      48569867      47718143
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<2>, NumHighPrioTasks<0>>/1024                     +0.0001         -0.0118     102404155     102414482      96908714      95760857
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<3>, NumHighPrioTasks<0>>/256                      +0.0000         -0.0574      25599943      25600621      25326679      23871733
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<3>, NumHighPrioTasks<0>>/512                      +0.0001         -0.0813      51200525      51206978      50459500      46355867
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<3>, NumHighPrioTasks<0>>/1024                     +0.0001         -0.0774     102400405     102409875     101483571      93631000
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<5>, NumHighPrioTasks<0>>/128                      -0.0002         +0.0456      12802792      12800864      11731131      12265881
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<5>, NumHighPrioTasks<0>>/256                      -0.0000         +0.0649      25601667      25601070      22686065      24157862
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<5>, NumHighPrioTasks<0>>/512                      -0.0005         +0.0513      51224453      51200650      45549867      47885533
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<7>, NumHighPrioTasks<0>>/64                       -0.0014         +0.2205       6408711       6400039       4698868       5734934
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<7>, NumHighPrioTasks<0>>/128                      +0.0155         +0.2459      12810413      13009276       9163080      11416117
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<7>, NumHighPrioTasks<0>>/256                      +0.0081         +0.2304      25603646      25811111      18779784      23106867
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<4>>/128                            -0.1103         -0.1108         24307         21625         24256         21568
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<4>>/256                            +0.0637         +0.0574         45588         48491         45498         48112
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<4>>/512                            -0.0519         -0.0539         90764         86054         90527         85648
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<3>, NumHighPrioTasks<4>>/128                            +0.1161         +0.1083         28810         32155         28722         31832
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<3>, NumHighPrioTasks<4>>/256                            +0.1152         +0.1094         64670         72123         64461         71512
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<3>, NumHighPrioTasks<4>>/512                            -0.0804         -0.0993        125916        115796        125476        113010
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<5>, NumHighPrioTasks<4>>/64                             +0.2682         -0.2446         53787         68210         51896         39203
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<5>, NumHighPrioTasks<4>>/128                            +0.5732         -0.4832        103915        163474        100825         52105
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<5>, NumHighPrioTasks<4>>/256                            +0.1283         -0.4606        211518        238645        203852        109957
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<7>, NumHighPrioTasks<4>>/16                             -0.1526         +0.1523         59673         50567         23275         26819
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<7>, NumHighPrioTasks<4>>/32                             -0.0492         +0.7075         82796         78719         24187         41298
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<7>, NumHighPrioTasks<4>>/64                             -0.0712         +0.0764        150268        139570         55304         59527
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<2>, NumHighPrioTasks<4>>/128                       -0.0004         -0.0828       6402859       6400308       6380145       5851557
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<2>, NumHighPrioTasks<4>>/256                       -0.0002         -0.0370      12802978      12801020      12769107      12296293
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<2>, NumHighPrioTasks<4>>/512                       -0.0028         -0.0799      25674170      25601862      25612667      23566586
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<3>, NumHighPrioTasks<4>>/128                       -0.0004         -0.0672       6402990       6400344       6382100       5953248
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<3>, NumHighPrioTasks<4>>/256                       -0.0004         -0.0841      12806197      12801334      12765891      11691661
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<3>, NumHighPrioTasks<4>>/512                       -0.0006         -0.0574      25615708      25601085      25533250      24067828
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<5>, NumHighPrioTasks<4>>/32                        -0.0163         -0.2801       1645647       1618805       1614735       1162471
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<5>, NumHighPrioTasks<4>>/64                        -0.0211         -0.2501       3285234       3216045       3217295       2412509
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<5>, NumHighPrioTasks<4>>/128                       -0.0502         -0.2956       6755976       6416549       6653264       4686407
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<7>, NumHighPrioTasks<4>>/8                         -0.0815         -0.2227        534476        490942        337482        262341
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<7>, NumHighPrioTasks<4>>/16                        +0.0973         -0.0629       1071127       1175390        664897        623053
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<7>, NumHighPrioTasks<4>>/32                        -0.2263         -0.3717       2297477       1777444       1488023        934861
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<2>, NumHighPrioTasks<4>>/64                       -0.0000         +0.0183       6400348       6400261       6145171       6257342
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<2>, NumHighPrioTasks<4>>/128                      +0.0000         +0.0194      12800545      12800759      12279474      12517804
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<2>, NumHighPrioTasks<4>>/256                      +0.0001         +0.0111      25601568      25602976      24636179      24909821
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<3>, NumHighPrioTasks<4>>/64                       -0.0000         +0.0545       6400600       6400444       5795288       6111077
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<3>, NumHighPrioTasks<4>>/128                      +0.0001         +0.0474      12800507      12801355      11566729      12114860
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<3>, NumHighPrioTasks<4>>/256                      +0.0000         +0.0423      25601503      25601760      23281967      24267276
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<5>, NumHighPrioTasks<4>>/32                       +0.0005         +0.2842       3201968       3203421       2175379       2793615
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<5>, NumHighPrioTasks<4>>/64                       -0.0003         +0.3807       6402555       6400496       4052465       5595309
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<5>, NumHighPrioTasks<4>>/128                      -0.0003         +0.3827      12804155      12800925       8114370      11219400
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<7>, NumHighPrioTasks<4>>/8                        +0.0262         +0.1272        821954        843475        503297        567320
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<7>, NumHighPrioTasks<4>>/16                       +0.0298         +0.3134       1634476       1683172        901978       1184619
BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<7>, NumHighPrioTasks<4>>/32                       +0.0147         +0.2925       3244262       3291994       1721000       2224350
OVERALL_GEOMEAN                                                                                                         +0.0185         -0.1876             0             0             0             0
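
The first two numeric columns of the comparison above are relative changes, (new - old) / old, so a negative value means the patched build used less time. A minimal sketch of that arithmetic, where the function name relative_change is an assumption made for illustration and not part of the benchmark tooling:

```cpp
// Relative change as shown in the "Time" and "CPU" columns:
// (new - old) / old. Negative means the new build is cheaper.
// `relative_change` is a hypothetical helper, not part of the tooling.
constexpr double relative_change(double old_value, double new_value) {
  return (new_value - old_value) / old_value;
}

// Values copied from the row
// BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<5>, NumHighPrioTasks<4>>/128:
//   Time Old = 103915, Time New = 163474
//   CPU  Old = 100825, CPU  New =  52105
constexpr double time_change = relative_change(103915, 163474);
constexpr double cpu_change  = relative_change(100825, 52105);
```

For that row, time_change rounds to +0.5732 and cpu_change to -0.4832, matching the table: wall time went up while CPU time roughly halved.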

@huixie90 huixie90 requested a review from a team as a code owner December 15, 2024 15:52
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Dec 15, 2024

llvmbot commented Dec 15, 2024

@llvm/pr-subscribers-libcxx

Author: Hui (huixie90)

Changes
  • [libc++] atomic wait more benchmark
  • fix compiler error
  • [libc++] remove yield from atomic::wait

Full diff: https://github.com/llvm/llvm-project/pull/120012.diff

2 Files Affected:

  • (modified) libcxx/include/__atomic/atomic_sync.h (+2-4)
  • (modified) libcxx/test/benchmarks/atomic_wait.bench.cpp (+275-19)
diff --git a/libcxx/include/__atomic/atomic_sync.h b/libcxx/include/__atomic/atomic_sync.h
index 153001e7b62e30..5ec792e9b9a29c 100644
--- a/libcxx/include/__atomic/atomic_sync.h
+++ b/libcxx/include/__atomic/atomic_sync.h
@@ -108,15 +108,13 @@ struct __atomic_wait_backoff_impl {
 
   _LIBCPP_AVAILABILITY_SYNC
   _LIBCPP_HIDE_FROM_ABI bool operator()(chrono::nanoseconds __elapsed) const {
-    if (__elapsed > chrono::microseconds(64)) {
+    if (__elapsed > chrono::microseconds(4)) {
       auto __contention_address = __waitable_traits::__atomic_contention_address(__a_);
       __cxx_contention_t __monitor_val;
       if (__update_monitor_val_and_poll(__contention_address, __monitor_val))
         return true;
       std::__libcpp_atomic_wait(__contention_address, __monitor_val);
-    } else if (__elapsed > chrono::microseconds(4))
-      __libcpp_thread_yield();
-    else {
+    } else {
     } // poll
     return false;
   }
diff --git a/libcxx/test/benchmarks/atomic_wait.bench.cpp b/libcxx/test/benchmarks/atomic_wait.bench.cpp
index d19f5fbed8ad60..b85aec49471729 100644
--- a/libcxx/test/benchmarks/atomic_wait.bench.cpp
+++ b/libcxx/test/benchmarks/atomic_wait.bench.cpp
@@ -12,21 +12,88 @@
 #include <cstdint>
 #include <numeric>
 #include <stop_token>
+#include <pthread.h>
+#include <sched.h>
 #include <thread>
+#include <chrono>
+#include <array>
 
 #include "benchmark/benchmark.h"
 #include "make_test_thread.h"
 
 using namespace std::chrono_literals;
 
-void BM_atomic_wait_one_thread_one_atomic_wait(benchmark::State& state) {
-  std::atomic<std::uint64_t> a;
-  auto thread_func = [&](std::stop_token st) {
+struct HighPrioTask {
+  sched_param param;
+  pthread_attr_t attr_t;
+  pthread_t thread;
+  std::atomic_bool stopped{false};
+
+  HighPrioTask(const HighPrioTask&) = delete;
+
+  HighPrioTask() {
+    pthread_attr_init(&attr_t);
+    pthread_attr_setschedpolicy(&attr_t, SCHED_FIFO);
+    param.sched_priority = sched_get_priority_max(SCHED_FIFO);
+    pthread_attr_setschedparam(&attr_t, &param);
+    pthread_attr_setinheritsched(&attr_t, PTHREAD_EXPLICIT_SCHED);
+
+    auto thread_fun = [](void* arg) -> void* {
+      auto* stop = reinterpret_cast<std::atomic_bool*>(arg);
+      while (!stop->load(std::memory_order_relaxed)) {
+        // spin
+      }
+      return nullptr;
+    };
+
+    if (pthread_create(&thread, &attr_t, thread_fun, &stopped) != 0) {
+      throw std::runtime_error("failed to create thread");
+    }
+  }
+
+  ~HighPrioTask() {
+    stopped = true;
+    pthread_attr_destroy(&attr_t);
+    pthread_join(thread, nullptr);
+  }
+};
+
+
+template <std::size_t N>
+struct NumHighPrioTasks {
+  static constexpr auto value = N;
+};
+
+
+struct KeepNotifying {
+  template <class Atomic>
+  static void notify(Atomic& a, std::stop_token st) {
     while (!st.stop_requested()) {
       a.fetch_add(1, std::memory_order_relaxed);
       a.notify_all();
     }
-  };
+  }
+};
+
+template <std::size_t N>
+struct NotifyEveryNus {
+  template <class Atomic>
+  static void notify(Atomic& a, std::stop_token st) {
+    while (!st.stop_requested()) {
+      auto start = std::chrono::system_clock::now();
+      a.fetch_add(1, std::memory_order_relaxed);
+      a.notify_all();
+      while (std::chrono::system_clock::now() - start < std::chrono::microseconds{N}) {
+      }
+    }
+  }
+};
+
+template <class NotifyPolicy, class NumPrioTasks>
+void BM_1_atomic_1_waiter_1_notifier(benchmark::State& state) {
+  [[maybe_unused]] std::array<HighPrioTask, NumPrioTasks::value> tasks{};
+  std::atomic<std::uint64_t> a;
+  auto thread_func = [&](std::stop_token st) { NotifyPolicy::notify(a, st); };
 
   std::uint64_t total_loop_test_param = state.range(0);
 
@@ -39,19 +106,34 @@ void BM_atomic_wait_one_thread_one_atomic_wait(benchmark::State& state) {
     }
   }
 }
-BENCHMARK(BM_atomic_wait_one_thread_one_atomic_wait)->RangeMultiplier(2)->Range(1 << 10, 1 << 24);
 
-void BM_atomic_wait_multi_thread_one_atomic_wait(benchmark::State& state) {
+BENCHMARK(BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<0>>)->RangeMultiplier(2)->Range(1 << 18, 1 << 20);
+BENCHMARK(BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<0>>)->RangeMultiplier(2)->Range(1 << 12, 1 << 14);
+BENCHMARK(BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<0>>)->RangeMultiplier(2)->Range(1 << 12, 1 << 14);
+
+BENCHMARK(BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<4>>)->RangeMultiplier(2)->Range(1 << 18, 1 << 20);
+BENCHMARK(BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<4>>)->RangeMultiplier(2)->Range(1 << 12, 1 << 14);
+BENCHMARK(BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<4>>)->RangeMultiplier(2)->Range(1 << 12, 1 << 14);
+
+BENCHMARK(BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<7>>)->RangeMultiplier(2)->Range(1 << 4, 1 << 6);
+BENCHMARK(BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<50>, NumHighPrioTasks<7>>)->RangeMultiplier(2)->Range(1 << 3, 1 << 5);
+BENCHMARK(BM_1_atomic_1_waiter_1_notifier<NotifyEveryNus<100>, NumHighPrioTasks<7>>)->RangeMultiplier(2)->Range(1 << 3, 1 << 5);
+
+
+template <std::size_t N>
+struct NumWaitingThreads {
+  static constexpr auto value = N;
+};
+
+template <class NotifyPolicy, class NumWaitingThreads, class NumPrioTasks>
+void BM_1_atomic_multi_waiter_1_notifier(benchmark::State& state) {
+  [[maybe_unused]] std::array<HighPrioTask, NumPrioTasks::value> tasks{};
+
   std::atomic<std::uint64_t> a;
-  auto notify_func = [&](std::stop_token st) {
-    while (!st.stop_requested()) {
-      a.fetch_add(1, std::memory_order_relaxed);
-      a.notify_all();
-    }
-  };
+  auto notify_func = [&](std::stop_token st) { NotifyPolicy::notify(a, st); };
 
   std::uint64_t total_loop_test_param = state.range(0);
-  constexpr auto num_waiting_threads  = 15;
+  constexpr auto num_waiting_threads  = NumWaitingThreads::value;
   std::vector<std::jthread> wait_threads;
   wait_threads.reserve(num_waiting_threads);
 
@@ -93,17 +175,113 @@ void BM_atomic_wait_multi_thread_one_atomic_wait(benchmark::State& state) {
     t.join();
   }
 }
-BENCHMARK(BM_atomic_wait_multi_thread_one_atomic_wait)->RangeMultiplier(2)->Range(1 << 10, 1 << 20);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<0>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 14, 1 << 16);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<0>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 12, 1 << 14);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<0>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 10, 1 << 12);
+
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<0>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 10, 1 << 12);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<0>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 8, 1 << 10);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<0>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 6, 1 << 8);
+
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<0>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 8, 1 << 10);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<0>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 6, 1 << 8);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<0>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 4, 1 << 6);
+
+
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<4>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 8, 1 << 10);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<4>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 6, 1 << 8);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<4>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 4, 1 << 6);
+
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<4>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 8, 1 << 10);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<4>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 6, 1 << 8);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<4>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 4, 1 << 6);
 
-void BM_atomic_wait_multi_thread_wait_different_atomics(benchmark::State& state) {
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<4>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 8, 1 << 10);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<4>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 6, 1 << 8);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<4>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 4, 1 << 6);
+
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<3>, NumHighPrioTasks<7>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 4, 1 << 6);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<7>, NumHighPrioTasks<7>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 3, 1 << 5);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<15>, NumHighPrioTasks<7>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 2, 1 << 4);
+
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<3>, NumHighPrioTasks<7>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 3, 1 << 5);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<7>, NumHighPrioTasks<7>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 2, 1 << 4);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<50>, NumWaitingThreads<15>, NumHighPrioTasks<7>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 1, 1 << 3);
+
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<3>, NumHighPrioTasks<7>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 3, 1 << 5);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<7>, NumHighPrioTasks<7>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 2, 1 << 4);
+BENCHMARK(BM_1_atomic_multi_waiter_1_notifier<NotifyEveryNus<100>, NumWaitingThreads<15>, NumHighPrioTasks<7>>)
+    ->RangeMultiplier(2)
+    ->Range(1 << 1, 1 << 3);
+
+
+template <std::size_t N>
+struct NumberOfAtomics {
+  static constexpr auto value = N;
+};
+
+template <class NotifyPolicy, class NumberOfAtomics, class NumPrioTasks>
+void BM_N_atomics_N_waiter_N_notifier(benchmark::State& state) {
+  [[maybe_unused]] std::array<HighPrioTask, NumPrioTasks::value> tasks{};
   const std::uint64_t total_loop_test_param = state.range(0);
-  constexpr std::uint64_t num_atomics       = 7;
+  constexpr std::uint64_t num_atomics       = NumberOfAtomics::value;
   std::vector<std::atomic<std::uint64_t>> atomics(num_atomics);
 
   auto notify_func = [&](std::stop_token st, size_t idx) {
     while (!st.stop_requested()) {
-      atomics[idx].fetch_add(1, std::memory_order_relaxed);
-      atomics[idx].notify_all();
+      NotifyPolicy::notify(atomics[idx], st);
     }
   };
 
@@ -154,6 +332,84 @@ void BM_atomic_wait_multi_thread_wait_different_atomics(benchmark::State& state)
     t.join();
   }
 }
-BENCHMARK(BM_atomic_wait_multi_thread_wait_different_atomics)->RangeMultiplier(2)->Range(1 << 10, 1 << 20);
+
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 12, 1 << 14);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<3>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 10, 1 << 12);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<5>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 10, 1 << 12);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<7>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 8, 1 << 10);
+
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<2>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 10, 1 << 12);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<3>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 8, 1 << 10);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<5>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 8, 1 << 10);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<7>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 6, 1 << 8);
+
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<2>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 8, 1 << 10);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<3>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 8, 1 << 10);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<5>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 7, 1 << 9);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<7>, NumHighPrioTasks<0>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 6, 1 << 8);
+
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 7, 1 << 9);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<3>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 7, 1 << 9);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<5>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 6, 1 << 8);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<7>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 4, 1 << 6);
+
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<2>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 7, 1 << 9);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<3>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 7, 1 << 9);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<5>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 5, 1 << 7);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<50>, NumberOfAtomics<7>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 3, 1 << 5);
+
+
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<2>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 6, 1 << 8);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<3>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 6, 1 << 8);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<5>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 5, 1 << 7);
+ BENCHMARK(BM_N_atomics_N_waiter_N_notifier<NotifyEveryNus<100>, NumberOfAtomics<7>, NumHighPrioTasks<4>>)
+     ->RangeMultiplier(2)
+     ->Range(1 << 3, 1 << 5);
 
 BENCHMARK_MAIN();
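
The atomic_sync.h hunk in the diff above reduces the backoff schedule from three stages to two: the yield stage between 4 and 64 microseconds is removed, and the blocking wait now starts after 4 microseconds. A minimal sketch of the two schedules, assuming hypothetical names (BackoffAction, backoff_before, backoff_after) that do not exist in libc++:

```cpp
#include <chrono>

// Hypothetical enum for illustration only; libc++ does not expose this.
enum class BackoffAction { Poll, Yield, Wait };

// Schedule before the patch: keep polling for the first 4us,
// yield the thread up to 64us, then fall back to a blocking wait.
constexpr BackoffAction backoff_before(std::chrono::nanoseconds elapsed) {
  if (elapsed > std::chrono::microseconds(64))
    return BackoffAction::Wait;
  if (elapsed > std::chrono::microseconds(4))
    return BackoffAction::Yield;
  return BackoffAction::Poll;
}

// Schedule after the patch: keep polling for the first 4us, then block.
// In the real code the Wait stage re-checks the monitored value once
// before calling __libcpp_atomic_wait; that detail is omitted here.
constexpr BackoffAction backoff_after(std::chrono::nanoseconds elapsed) {
  if (elapsed > std::chrono::microseconds(4))
    return BackoffAction::Wait;
  return BackoffAction::Poll;
}

// A waiter that has been spinning for 10us used to yield; now it blocks.
static_assert(backoff_before(std::chrono::microseconds(10)) == BackoffAction::Yield, "");
static_assert(backoff_after(std::chrono::microseconds(10)) == BackoffAction::Wait, "");
```

The sketch only models which stage is chosen for a given elapsed time; it says nothing about how the operating system schedules the thread in each stage.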

github-actions bot commented Dec 15, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@ldionne ldionne left a comment

Thanks a lot for all the iterations you did on this. It's really difficult to get clear-cut benchmarks showing that this is better, but we've had numerous reports that yielding in this algorithm was hostile to the operating system, so I think this patch needs to land.

ldionne commented Jan 9, 2025

(Some CI failures need to be addressed before this can be merged though)

ldionne commented Jan 10, 2025

After looking into the CI issues more (https://github.com/llvm/llvm-project/actions/runs/12432576208/job/34712280743?pr=120012), the problem is this:

# .---command stderr------------
  # | In file included from /__w/llvm-project/llvm-project/libcxx/test/benchmarks/atomic_wait_N_waiter_N_notifier.bench.cpp:11:
  # | /__w/llvm-project/llvm-project/libcxx/test/benchmarks/atomic_wait_helper.h:33:43: error: ambiguous use of internal linkage declaration 'PTHREAD_EXPLICIT_SCHED' defined in multiple modules [-Werror,-Wmodules-ambiguous-internal-linkage]
  # |    33 |     pthread_attr_setinheritsched(&attr_t, PTHREAD_EXPLICIT_SCHED);
  # |       |                                           ^
  # | /usr/include/pthread.h:129:33: note: expanded from macro 'PTHREAD_EXPLICIT_SCHED'
  # |   129 | #define PTHREAD_EXPLICIT_SCHED  PTHREAD_EXPLICIT_SCHED
  # |       |                                 ^
  # | /usr/include/pthread.h:128:3: note: declared here
  # |   128 |   PTHREAD_EXPLICIT_SCHED
  # |       |   ^
  # | /usr/include/pthread.h:128:3: note: declared here in module 'std.thread.support'
  # |   128 |   PTHREAD_EXPLICIT_SCHED
  # |       |   ^
  # | 1 error generated.
  # `-----------------------------

For context, we have the following in our modulemap:

module thread {
  ...

  module support {
    header "__thread/support.h"
    export *
  }
  module support_impl {
    textual header "__thread/support/c11.h"
    textual header "__thread/support/external.h"
    textual header "__thread/support/pthread.h" // this includes <pthread.h>
    textual header "__thread/support/windows.h"
  }

  header "thread"
  export *
}

So we're basically including <pthread.h> into the std.thread.support module via a transitive chain of textual includes. That means that, on a platform where /usr/include is not modularized, we're textually including <pthread.h> into std.thread.support and basically exporting everything it declares (including the PTHREAD_EXPLICIT_SCHED macro) from std.thread.support. That's wrong, since we shouldn't be exporting stuff that we don't own.

I'm not sure how to fix that. In some sense, I'd like to be able to say this:

module support {
    header "__thread/support.h"
    export "std::__libcpp_mutex_t"
    export "std::__libcpp_timespec_t"
    etc...
}

but that's not how it works. @ian-twilightcoder do you know how we can solve this problem?

Trying something out here: #122506

@huixie90
Contributor Author

For reference, I split the benchmark into 3 test cases because these tests take time. For example, the test atomic_wait_1_waiter_1_notifier.bench.cpp takes 2 minutes 30 seconds on my laptop.

@ldionne ldionne added the pending-ci Merging the PR is only pending completion of CI label Jan 13, 2025

ldionne commented Jan 17, 2025

Gentle ping @ian-twilightcoder ^

@ian-twilightcoder
Contributor

That means that, on a platform where /usr/include is not modularized, we're textually including <pthread.h> into std.thread.support
...
@ian-twilightcoder do you know how we can solve this problem?

You can't. If /usr/include is not modularized, you can't use libc++'s clang module either. We used to be able to cheat when libc++ was a single module and there were no other modules in the build at all, but now that it's multiple modules, there's no cheat. If you're going to use clang modules, the entire dependency tree has to be modular. We actually gave a talk about that: https://youtu.be/tcB1vXc4L8M?si=Vqr5_sB7wczZDteF

@huixie90 huixie90 merged commit 699f196 into llvm:main Jan 18, 2025
76 checks passed