Lock-free vs wait-free concurrency


看leveldb源码中遇到的,关于lock-free 和 wait-free..感觉这个讲得不错,我试着翻译一下?

There are two types of non-blocking thread synchronization algorithms – lock-free, and wait-free. Their meaning is often confused. In lock-free systems, while any particular computation may be blocked for some period of time, all CPUs are able to continue performing other computations. To put it differently, while a given thread might be blocked by other threads in a lock-free system, all CPUs can continue doing other useful work without stalls. Lock-free algorithms increase the overall throughput of a system by occassionally increasing the latency of a particular transaction. Most high- end database systems are based on lock-free algorithms, to varying degrees.

有两种 无阻塞线程同步算法,一种是lock-free,一种是wait-free,它们的含义经常被搞混。在一个lock-free系统中,尽管某个特定的计算会被阻碍一段时间,所有的cpu还是能够继续其他计算。换一种说法,尽管在一个lock-free的系统里,一个线程可能被其他线程阻碍,所有的cpu仍然可以继续做其他工作而不做停顿。lock-free算法通过偶然增加一个特定事物的延迟,增加了系统总体的吞吐率。大多数高端数据库系统都在某种程度上基于lock-free 算法。


By contrast, wait-free algorithms ensure that in addition to all CPUs continuing to do useful work, no computation can ever be blocked by another computation. Wait-free algorithms have stronger guarantees than lock-free algorithms, and ensure a high thorughput without sacrificing latency of a particular transaction. They’re also much harder to implement, test, and debug. The lockless page cachepatches to the Linux kernel are an example of a wait-free system.



In a situation where a system handles dozens of concurrent transactions and has soft latency requirements, lock-free systems are a good compromise between development complexity and high concurrency requirements. A database server for a website is a good candidate for a lock-free design. While any given transaction might block, there are always more transactions to process in the meantime, so the CPUs will never stay idle. The challenge is to build a transaction scheduler that maintains a good mean latency, and a well bounded standard deviation.


In a scenario where a system has roughly as many concurrent transactions as CPU cores, or has hard real-time requirements, the developers need to spend the extra time to build wait-free systems. In these cases blocking a single transaction isn’t acceptable – either because there are no other transactions for the CPUs to handle, minimizing the throughput, or a given transaction needs to complete with a well defined non-probabilistic time period. Nuclear reactor control software is a good candidate for wait-free systems.



作者: CrazyKK