@@ -10,13 +10,19 @@ to provide synchronization of initialization of local static objects, and by
 * **boost**: `boost::mutex`
 * **mcf0i**: `_MCF_mutex` without inlining
 
-![hyperfine](doc/hyperfine.png)
-
 > [!WARNING]
 > This project uses some undocumented NT system calls and is not guaranteed to
 > work on some Windows versions. The author gives no warranty for this project.
 > Use it at your own risk.
 
+## Benchmark Result
+
+This is the result of [a benchmark program](doc/mutex_benchmark.c) on Windows
+11 Insider Preview (Dev channel, Build 26300.7760) on an Intel i9 14900K
+processor:
+
+![result_win11_26300_i9_10900k](doc/result_win11_26300_i9_10900k.png)
+
 ## How to Build
 
 Compiling natively can be done in MSYS2. We take the UCRT64 shell as an example.
@@ -51,57 +57,6 @@ ninja test
 > `__cxa_finalize(&__dso_handle)` followed by `fflush(NULL)` upon receipt of
 > `DLL_PROCESS_DETACH` in your `DllMain()`.
 
-## Benchmarking
-
-* **#THREADS**: number of threads
-* **#ITERATIONS**: number of iterations per thread
-* **SRWLOCK**: Windows `SRWLOCK`
-* **CRITICAL_SECTION**: Windows `CRITICAL_SECTION`
-* **WINPTHREAD**: winpthread `pthread_mutex_t`
-* **MCFGTHREAD**: mcfgthread `_MCF_mutex` without inlining
-
-These are results of [the test program](doc/mutex_performance.c) on an x86-64
-*Windows 10* machine with a 10-core *Intel i9 10900K* processor:
-
-| #THREADS | #ITERATIONS | SRWLOCK | CRITICAL_SECTION | WINPTHREAD | MCFGTHREAD |
-|---------:|------------:|--------------:|-----------------:|--------------:|--------------:|
-| 1 | 20,000,000 | 1541.035 ms | 1684.556 ms | **1537.788 ms** | 1539.504 ms |
-| 2 | 10,000,000 | 1410.687 ms | 1916.520 ms | 2135.853 ms | **1377.103 ms** |
-| 4 | 5,000,000 | 2070.238 ms | 4613.832 ms | 2979.166 ms | **1553.278 ms** |
-| 6 | 3,000,000 | 2500.003 ms | 5016.650 ms | 3159.182 ms | **1409.130 ms** |
-| 10 | 1,500,000 | 2416.953 ms | 6239.123 ms | 3004.653 ms | **1177.269 ms** |
-| 20 | 600,000 | 2266.024 ms | 8687.350 ms | 2559.691 ms | **1001.314 ms** |
-| 60 | 200,000 | **2831.348 ms** | 10164.012 ms | 3814.880 ms | 3299.509 ms |
-| 200 | 60,000 | **2849.850 ms** | 10544.007 ms | 3825.518 ms | 3579.925 ms |
-
-And these are results of the same program on *Wine 6.0.3* on an x86-64
-*Ubuntu 22.04* virtual machine with a 16-core *AMD EPYC2* processor:
-
-| #THREADS | #ITERATIONS | SRWLOCK | CRITICAL_SECTION | WINPTHREAD | MCFGTHREAD |
-|---------:|------------:|--------------:|-----------------:|--------------:|--------------:|
-| 1 | 10,000,000 | 2466.983 ms | 2574.892 ms | **2444.599 ms** | 3167.704 ms |
-| 2 | 5,000,000 | 1940.147 ms | **1918.091 ms** | 2078.076 ms | 2213.607 ms |
-| 4 | 2,000,000 | 3717.442 ms | 5356.369 ms | 3859.484 ms | **1974.007 ms** |
-| 6 | 1,000,000 | 3517.333 ms | 4519.209 ms | 2474.208 ms | **1582.614 ms** |
-| 10 | 500,000 | 3105.191 ms | 4706.027 ms | 2388.662 ms | **1363.926 ms** |
-| 20 | 200,000 | 2721.077 ms | 4262.151 ms | 1966.195 ms | **1340.997 ms** |
-| 60 | 60,000 | 2397.048 ms | 3807.141 ms | 1530.147 ms | **1511.931 ms** |
-| 200 | 20,000 | 2632.933 ms | 4148.604 ms | **1615.904 ms** | 1784.553 ms |
-
-And these are results of the same program on an ARM *Windows 11* machine with
-an 8-core *Qualcomm Snapdragon 8cx Gen 3* processor, compiled with Clang:
-
-| #THREADS | #ITERATIONS | SRWLOCK | CRITICAL_SECTION | WINPTHREAD | MCFGTHREAD |
-|---------:|------------:|--------------:|-----------------:|--------------:|--------------:|
-| 1 | 10,000,000 | 2105.027 ms | 2164.209 ms | 2122.998 ms | **2033.915 ms** |
-| 2 | 5,000,000 | 1701.007 ms | 1620.484 ms | 1547.963 ms | **1496.309 ms** |
-| 4 | 2,000,000 | **1395.439 ms** | 3067.075 ms | 2583.215 ms | 1525.453 ms |
-| 6 | 1,000,000 | **1181.352 ms** | 4334.280 ms | 2167.916 ms | 1354.046 ms |
-| 10 | 500,000 | 2738.153 ms | 2799.624 ms | **2687.904 ms** | 2739.022 ms |
-| 20 | 100,000 | 3259.999 ms | **3220.732 ms** | 3287.581 ms | 3291.146 ms |
-| 60 | 30,000 | 2931.157 ms | 2934.896 ms | 2938.784 ms | **2922.015 ms** |
-| 200 | 10,000 | **3197.414 ms** | 3216.323 ms | 3221.090 ms | 3229.249 ms |
-
 ## Implementation details
 
 ### The condition variable