Relativistic Programming

Overview

What is RP?

RP stands for "Relativistic Programming." RP is a concurrent programming technique for shared memory, multicore architectures. By default, RP uses a unique temporal frame of reference for each processor's accesses to memory. Communication among processors occurs through explicit RP primitives for publishing and acquiring snapshots of memory locations. Compared to conventional synchronization strategies and memory models this approach makes very little use of atomic read-modify-write instructions or memory barriers. Hence it is able to localize the effects of individual execution sequences within a parallel computation and greatly reduce interference among processors.

How does it relate conceptually to other synchronization techniques?

Synchronization strategies can be categorized according to their method for dealing with concurrent memory accesses. Conventionally, concurrent read and write accesses to the same memory location are viewed as conflicting, as are concurrent write accesses. Concurrent read accesses are non-conflicting. Synchronization strategies based on locking can be classified as conflict prevention strategies since they block competing threads to prevent conflicts. Conversely, non-blocking synchronization strategies and most implementations of transactional memory can be classified as conflict recovery strategies since they allow conflicts to occur but then use roll-back as a recovery mechanism. RP is a conflict tolerant strategy because it tolerates concurrent reads and writes to the same memory location. These conventionally conflicting accesses are tolerated in RP because each processor has its own view of memory. This approach allows reads to be treated as having occurred earlier than concurrent writes unless there is some explicit ordering specified in the program that prevents this. The approach just described is more accurately classified as a read-write conflict tolerant approach. An example of a write-write conflict tolerant approach is split counters. Split counters partition a counter across CPUs and tolerate concurrent writes on different CPUs. Only reads of the counter require access to all CPUs in a coordinated way. Hence, reads are viewed as conflicting with writes, but multiple concurrent writes do not conflict with each other and are unordered.

What challenges does RP introduce?

RP introduces several new challenges. First, the temporal reference frames of different processors require multiple versions of memory to be maintained. In the absence of infinite memory, garbage collection technology is required to recycle old versions of memory. This technology must function in such a way as to sustain the performance and scalability benefits of RP. Specifically, it must not require frequent interprocessor interactions nor use expensive atomic instructions or memory barriers in the conflict tolerant execution sequences. Second, it requires a programming paradigm shift on the part of the concurrent programmer. The programming paradigm is not difficult to master, but it is new. Example uses of RP exist for certain concurrent data types but full generalization of the approach is still an active area of research.

Does it scale well?

RP has extremely good performance and scalability properties. Many uses of RP in the Linux Kernel have resulted in several orders of magnitude performance improvement compared to locking and transactional memory.

Is it easy to program with?

RP is not difficult to program with. Allowing each execution sequence to proceed using its own view of memory, by default, simplifies concurrent programming because it prevents memory from changing spontaneously. Threads are guaranteed to observe coherent memory, i.e., memory will contain values that were actually written at some time in the past. Read paths also exhibit deterministic performance characteristics, since they can not block or retry. This feature simplifies programming of time-sensitive systems. Nevertheless, RP is a new programming paradigm with a new interface and there are several ways to misuse it. Many of the misuses we have observed so far can be caught through static analysis. Development of static analysis tools for RP is an active area of research within our project.

How much real-world use has it seen?

Read Copy Update (RCU), an early example of RP, has seen extensive real-world use in the Linux Kernel. The number of uses of RCU in the Linux Kernel (as of 2.6.23) stands at over 1500 and is growing rapidly. It has been used successfully by numerous developers across several different sub-systems in Linux. It has also seen production use in other operating systems including DYNIX/ptx, IBM's VM hypervisor layer and the Tornado and K42 operating systems. In many cases it has led to substantial performance improvements and code simplification.

People

Collaborators

Papers

"Why The Grass May Not Be Greener On The Other Side: A Comparison of Locking and Transactional Memory," Paul E. McKenney, Maged M. Michael, and Jonathan Walpole In Proceedings of the 4th Workshop on Programming Languages and Operating Systems (PLOS 2007), Skamania Lodge, WA, October 2007. (talk slides )

"Technologies Supporting Realtime Response from Linux on Shared-Memory Multiprocessor Systems," D. Guniguntala, P.E. McKenney, J. Stultz, J. Triplett, and J. Walpole, IBM Systems Journal, Volume 47, Number 2, 2007.

"Performance of Memory Reclamation for Lockless Synchronization" Thomas E Hart, Paul E McKenney, Angela Demke Brown, and Jonathan Walpole, to appear in the Journal of Parallel and Distributed Computing.

"Exploiting Deferred Destruction: An Analysis of Read-Copy-Update Techniques" Paul McKenney, Ph.D. Thesis, OGI School of Science and Engineering, OHSU, July 2004. (presentation)

Sponsors

Mailing list

Mailing list