Over the past two decades the research community has developed a body of knowledge concerning “Lock-Free” and “Wait-Free” algorithms and data structures. These techniques allow concurrent update of shared data structures without resorting to critical sections protected by operating system managed locks. A number of wait-free and lock free algorithms for simple data structures such as LIFO stacks and FIFO queues have been published. Lock-free algorithms for more complex data structures such as priority queues, hash tables, sets, and red-black trees are also known.
Some of the most commonly stated benefits of lock-free synchronisation are:
- Efficiency benefits compared to lock-based algorithms for some workloads, including potential scalability benefits on multiprocessor machines.
- Support for concurrent update to shared data structures even when locks aren’t available (e.g. in interrupt handlers etc.)
- Avoidance of priority inversion in real-time systems.
A significant benefit of lock (or wait)-freedom for real-time systems is that by avoiding locks the potential for priority inversion is avoided. Solutions for avoiding priority inversion usually involve special real-time process schedulers. On platforms where a real-time scheduler is not present, lock-free data structures provide an opportunity to sidestep the hazards of interlocking with the scheduler.
With the exception of a uniprocessor implementation of a single-reader single-writer ring buffer FIFO, all the lock-free algorithms which I have encountered require the use of special atomic processor instructions such as CAS (compare and swap) or LL/SC (load linked/store conditional). Furthermore, the correct implementation of these algorithms also requires an understanding of the use of memory barriers to force the order of some memory reads and writes on multiprocessor systems. This is because memory controllers may reorder reads and writes as observed by other processors on an multiprocessor system (or by prehipherals on a uniprocessor system).
Lock free structures in computer music
One example of a context that can benefit from lock-free synchronisation is the desktop digital audio domain. Commodity operating systems such as Microsoft Windows and Macintosh OS-X do not provide real-time schedulers, yet there is often a requirement for concurrent access to shared data structures accross separate threads maintaining a GUI, performing real-time audio rendering, and performing disk and network i/o.
Lock-free data structures are not unknown in the computer music world. For example the venerable single-reader single-writer atomic ring buffer FIFO is found in many systems including PortAudio, PortMidi, and SuperCollider. More complex data structures such as node-based lock free queues are found in MidiShare (see also post to LAD). JACK also cites some lock-free publications although it is not clear if it uses them. Serpent uses lock-free data structures. JSyn uses lock-free FIFOs and singly-linked-lists.
Ken Greenbaum informs me that the SGI AL implementation uses the single-reader, single-writer ring buffer algorithm, and successfully implements this on symmetric MP machines.
In search of an open source library
It would be useful to have a cross-platform library of lock-free primitives for implementing real-time applications on desktop operating systems. Such a library would include implementations of queues and stacks, low level atomic update operations and memory barriers. It may also be useful to include higher level infrastructure, such as a robust implementation of deferred C++ function calls (for example see the single reader, single writer prototype code here). The present author is seeking to find or develop a library released under a permissive open source license allowing use in closed-source applications. These requirements can be summarised as follows:
- Portability with support for (at least) the dominant desktop computing platforms (i.e. PPC, x86, Mac-OSX, Linux, Windows).
- Implementation of common lock-free data structures such as LIFO stack and FIFO queue.
- Licensed under a flexible license permitting use in both open-source and closed-source systems (e.g. LGPL or BSD licensed)
Thus far a library which fulfills the above requirements has not been identified. If you know of such a library, or are interested in contributing to the development of one, please let me know.
Existing source code and libraries.
The following is a list of open-source libraries which provide implementations of lock-free data structures. If you know of one which isn’t listed here, please let me know.
- MidiShare Source Code is available under the GPL license. MidiShare includes implementations of lock-free FIFO queues and LIFO stacks.
- Appcore is an SMP and HyperThread friendly library which uses Lock-free techniques to implement stacks, queues, linked lists and other useful data structures. Appcore appears currently to be for x86 computers running Windows. The licensing terms of Appcore are extremely unclear.
- Noble – a library of non-blocking synchronisation protocols. Implements lock-free stack, queue, singly linked list, snapshots and registers. Noble is distributed under a license which only permits non-commercial academic use.
- lock-free-lib published under the GPL or BSD license (your choice). Includes implementations of software transactional memory, multi-workd CAS primitives, skip lists, binary search trees, and red-black trees. For Alpha, Mips, ia64, x86, PPC, and Sparc.
- Nonblocking multiprocessor/multithread algorithms in C++ (for MSVC/x86) posted by Joshua Scholar to musicdsp.org, and are presumably in the public domain. Included are queue, stack, reference-counted garbage collection, memory allocation, templates for atomic algorithms and types. This code is largely untested. A local mirror is here.
- Larry Bush’s implementation of Valois’ lock-free linked list in C++
- Qprof includes the Atomic_ops library of atomic operations and data structures under an MIT-style license. Only available for Linux at the moment, but there are plans to support other platforms. download available here
- Amino Concurrent Building Blocks provides lock free datastructures and STM for C++ and Java under an Apache Software (2.0) licence.
- I asked the experts on comp.programming.threads about the single-reader single-writer atomic FIFO which can be found in SuperCollider and PortAudio among other places. Here’s here’s what was said. In summary, without memory barriers the code is not SMP safe.
- Bob McGwier sent word that the DttSP Project is used by over 1000 amateur radio operators in their radios every single day. It uses jack ring buffers and non locking code. On Linux it uses the jack library for ring buffers (but used outside of jack). They did their own ring buffer implementation for the Windows port: ringb.c and ringb.h.
Brief literature survey
This is not an exhaustive list, but the hope is that enough core articles are linked to give a good introduction to the field. Most of the resources here can be turned up on google or citeseer with search terms such as “lock free”, “lock-free”, “wait free”, “lock free queue”, “non-blocking”, “atomic fifo”, etc.
- Audio Anecdotes Volume II contains an article named “Introduction to the ring buffer FIFO Queue” which goes into detail of the single-reader single-writer queue and provides an implementation.
- D. Fober, Y. Orlarey, S. Letz at GRAME are the authors of MidiShare, and have published a number of papers and technical reports describing and evaluating lock-free data structures. One of their papers describes the LIFO for x86 used in MidiShare.
- John D. Valois published one of the often cited lock-free queue algorithms (which incidentally had a race condition corrected in Michael and Scott’s “Correction of a Memory Management Method for Lock-Free Data Structures”). You can find Valois’ papers on citeseer, but I don’t have a valid link for his home page or his PhD thesis.
- Maurice P. Herlihy at Brown has published a number of papers regarding lock-free data structures.
- Mark Moir University of Pittsburgh (also at Sun) is currently the Principal Investigator of the Scalable Synchronisation Research Group at Sun Labs.
- James H. Anderson and colleagues at UNC Chapel Hill have published regarding lock-free and wait-free algorithms, with specific reference to real-time systems. Including his PhD student Srikanth Ramamurthy, whose dissertation was entitled A Lock-Free Approach to Object Sharing in Real-Time Systems.
- Philippas Tsigas, Yi Zhang and colleagues at Chalmers University have a number of publications regarding wait-free and lock-free algorithms. Together with Håkan Sundell developed the NOBLE library of non-blocking synchronisation protocols. See also Wait Free Techniques for Real Time Processing.
- Maged M. Michael at TJ Watson (also at URCS) and Michael L. Scott have developed a number of lock-free algorithms including one of the most well known lock-free queue algorithms. Maged Michael also wrote Safe Memory Reclamation for Dynamic Lock-Free Objects Using Atomic Reads and Writes
- Practical lock-free data structures includes a number of publications about lock-free techniques, and includes their lock-free-lib listed above. The University of Cambridge group includes Tim Harris, Keir Fraser, Ian Pratt and Chris Purcell.
- Paul E. McKenney has published regarding Read-Copy Update.
- H. Huang, P. Pillai, and K. G. Shin, “Improving Synchronization-Free Algorithms for Interprocess Communication in Embedded Real-Time Systems,” in Proceedings of USENIX Technical Conference, June 2002.
- Anthony LaMarca: A Performance Evaluation of Lock-Free Synchronization Protocols. PODC 1994: 130-140
- M. Hohmuth and H. Härtig at The Dresden Real-Time Operating System Project have some of publications regarding pragmatic non-blocking synchronisation.
- Wim H. Hesselink has unpublished manuscripts covering validation of Lock-Free algorithms, and an “almost wait-free” resizable hash table.
- Lock-free scheduling of logical processes in parallel simulation, Jason Liu, David M. Nicol, and King Tan, Proceedings of the 15th Workshop on Parallel and Distributed Simulation (PADS’01), Lake Arrowhead, CA, May 15-18, 2001, pp. 22-31.
- Hans Boehm is part of a group which is working on the specification of threads and memory model for the next revision of the C++ specification. His paper Threads Cannot Be Implemented As a Library gives some background.
Other possibly relevant links
- DOME A distributed object oriented execution environment (paper at citeseer)
- Hood A Threads Library for Multiprogrammed Multiprocessors
- Arch is a library of tools for multiprocessor machine and computer cluster programming
- Memory Barriers on Multiprocessor Architectures
- Shared Memory Consistency Models: A Tutorial