I was curious about the performance of different synchronization primitives, so I ran the following experiments to compare atomic operations, a spinlock, and a mutex:
1. No synchronization
    #include <functional>
    #include <future>
    #include <iostream>

    volatile int value = 0;

    int loop (bool inc, int limit) {
      std::cout << "Started " << inc << " " << limit << std::endl;
      for (int i = 0; i < limit; ++i) {
        if (inc) {
          ++value;
        } else {
          --value;
        }
      }
      return 0;
    }

    int main () {
      // Run loop on a separate thread (std::async is a C++11 feature).
      auto f = std::async (std::launch::async, std::bind (loop, true, 20000000));
      loop (false, 10000000);
      f.wait ();
      std::cout << value << std::endl;
    }
Compile with clang:
    clang++ -std=c++11 -stdlib=libc++ -O3 -o test test.cpp && time ./test
Run:
    SSttaarrtteedd 10 2100000000000000

    11177087

    real 0m0.070s
    user 0m0.089s
    sys 0m0.002s
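The result shows clearly that increment and decrement are not atomic: the final contents of value are indeterminate (garbage). The first output line is garbled because both threads write to std::cout at the same time.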
2. Assembly LOCK
    #include <functional>
    #include <future>
    #include <iostream>

    volatile int value = 0;

    int loop (bool inc, int limit) {
      std::cout << "Started " << inc << " " << limit << std::endl;
      for (int i = 0; i < limit; ++i) {
        if (inc) {
          asm("LOCK");  // x86 only: emits a bare LOCK prefix before the increment
          ++value;
        } else {
          asm("LOCK");  // x86 only: emits a bare LOCK prefix before the decrement
          --value;
        }
      }
      return 0;
    }

    int main () {
      // Run loop on a separate thread (std::async is a C++11 feature).
      auto f = std::async (std::launch::async, std::bind (loop, true, 20000000));
      loop (false, 10000000);
      f.wait ();
      std::cout << value << std::endl;
    }
    SSttaarrtteedd 10 2000000100000000

    10000000

    real 0m0.481s
    user 0m0.779s
    sys 0m0.005s
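In the end value holds the correct result, but this code is not portable: it runs only on x86 hardware, and it works correctly only when compiled with -O3. LOCK is a prefix that must be immediately followed by the instruction it applies to. At other optimization levels the compiler may place other instructions between the LOCK prefix and the increment or decrement, and the program then easily crashes with an "illegal instruction" exception.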
3. Atomic operations
    #include <functional>
    #include <future>
    #include <iostream>
    #include "boost/interprocess/detail/atomic.hpp"

    using namespace boost::interprocess::ipcdetail;

    volatile boost::uint32_t value = 0;

    int loop (bool inc, int limit) {
      std::cout << "Started " << inc << " " << limit << std::endl;
      for (int i = 0; i < limit; ++i) {
        if (inc) {
          atomic_inc32 (&value);  // atomic increment
        } else {
          atomic_dec32 (&value);  // atomic decrement
        }
      }
      return 0;
    }

    int main () {
      auto f = std::async (std::launch::async, std::bind (loop, true, 20000000));
      loop (false, 10000000);
      f.wait ();
      std::cout << atomic_read32 (&value) << std::endl;
    }
Run:
    SSttaarrtteedd 10 2100000000000000

    10000000

    real 0m0.457s
    user 0m0.734s
    sys 0m0.004s
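The result is correct, and the running time is about the same as for the assembly LOCK version. That is no surprise: on x86 the atomic operations are implemented with LOCK-prefixed instructions under the hood, only in a portable way.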
4. Spinlock
    #include <functional>
    #include <future>
    #include <iostream>
    #include <mutex>  // std::lock_guard
    #include "boost/smart_ptr/detail/spinlock.hpp"

    boost::detail::spinlock lock;
    volatile int value = 0;

    int loop (bool inc, int limit) {
      std::cout << "Started " << inc << " " << limit << std::endl;
      for (int i = 0; i < limit; ++i) {
        // Hold the spinlock for the duration of one update.
        std::lock_guard<boost::detail::spinlock> guard(lock);
        if (inc) {
          ++value;
        } else {
          --value;
        }
      }
      return 0;
    }

    int main () {
      auto f = std::async (std::launch::async, std::bind (loop, true, 20000000));
      loop (false, 10000000);
      f.wait ();
      std::cout << value << std::endl;
    }
Run:
    SSttaarrtteedd 10 2100000000000000

    10000000

    real 0m0.541s
    user 0m0.675s
    sys 0m0.089s
The result is correct. It is a little slower than the previous two versions, but not by much.
5. Mutex
    #include <functional>
    #include <future>
    #include <iostream>
    #include <mutex>

    std::mutex mutex;
    volatile int value = 0;

    int loop (bool inc, int limit) {
      std::cout << "Started " << inc << " " << limit << std::endl;
      for (int i = 0; i < limit; ++i) {
        // Hold the mutex for the duration of one update.
        std::lock_guard<std::mutex> guard (mutex);
        if (inc) {
          ++value;
        } else {
          --value;
        }
      }
      return 0;
    }

    int main () {
      auto f = std::async (std::launch::async, std::bind (loop, true, 20000000));
      loop (false, 10000000);
      f.wait ();
      std::cout << value << std::endl;
    }
Run:
    SSttaarrtteedd 10 2010000000000000

    10000000

    real 0m25.229s
    user 0m7.011s
    sys 0m22.667s
The mutex is far slower than any of the previous approaches.
    Benchmark

    Method               Real time (sec.)
    No synchronization    0.070
    LOCK                  0.481
    Atomic                0.457
    Spinlock              0.541
    Mutex                25.229
Of course, the results depend on the platform and the compiler (I ran the tests on a MacBook Air with clang).
Original article: http://demin.ws/blog/english/2012/05/05/atomic-spinlock-mutex/