Fault-tolerant wait-free implementations and robust wait-free hierarchies
Publisher:
  • Cornell University
  • PO Box 250, 124 Roberts Place Ithaca, NY
  • United States
Order Number: UMI Order No. GAX94-09563
Abstract

In shared-memory systems, complex (shared) objects, such as queues and stacks, are implemented in software from simple objects, such as registers and test-and-set objects, which are often supported in hardware. Traditional implementations use lock-based techniques and are consequently not fault-tolerant: if any process crashes while holding the lock, the other processes are effectively prevented from accessing the implemented object. Wait-free implementations, which have been the focus of much recent research, were introduced to overcome this drawback. An implementation is wait-free if every access by a non-faulty process is guaranteed a response, regardless of whether the other processes are slow, fast, or have crashed. This thesis addresses the following two issues concerning wait-free implementations.

(1) Shared objects with wait-free implementations tolerate the failure of processes, but not the failure of the base (hardware) objects from which they are implemented. We consider the problem of implementing shared objects that tolerate the failure of both processes and base objects.

We identify two classes of object failures: responsive and non-responsive. With responsive failures, a faulty object responds to every operation, but its responses may be incorrect. With non-responsive failures, a faulty object may also "hang" without responding. In each class, we define crash, omission, and arbitrary modes of failure.

We show that all responsive failure modes can be tolerated. More precisely, for all responsive failure modes $\mathcal{F}$, object types $T$, and $t \ge 0$, we show how to implement a shared object of type $T$ that is $t$-tolerant for $\mathcal{F}$. Such an object remains correct and wait-free even if up to $t$ base objects fail according to $\mathcal{F}$. In contrast to responsive failures, we show that even the most benign non-responsive failure mode cannot be tolerated. We also show that randomization can be used to circumvent this impossibility result.

(2) Some objects are stronger than others in their ability to support wait-free implementations. It is thus natural to ask whether objects can be placed in a hierarchy accordingly. We identify robustness as a desirable property of such a hierarchy. Roughly speaking, a hierarchy is robust if no object at a given level has a wait-free implementation using objects at lower levels.
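As an illustration of one object being stronger than another (this example is not taken from the thesis), the sketch below shows the classical wait-free protocol in which two processes reach agreement using one test-and-set object and two registers, a task known to be impossible with registers alone. The names `TestAndSet`, `TwoProcessConsensus`, and `run_once` are hypothetical, and the internal lock only simulates the atomicity that hardware would provide for the base object; the protocol built on top of it takes a fixed number of steps per process and never waits for the other process.

```python
import threading


class TestAndSet:
    """Simulated atomic test-and-set bit.

    Assumption: the lock stands in for hardware atomicity of the base
    object; it is not part of the protocol itself.
    """

    def __init__(self):
        self._bit = 0
        self._guard = threading.Lock()

    def test_and_set(self):
        # Atomically set the bit to 1 and return its previous value.
        with self._guard:
            old = self._bit
            self._bit = 1
            return old


class TwoProcessConsensus:
    """Agreement for two processes (ids 0 and 1) from test-and-set + registers."""

    def __init__(self):
        self._tas = TestAndSet()
        # One single-writer register per process.
        self._proposed = [None, None]

    def decide(self, pid, value):
        # Step 1: announce the proposal before competing.
        self._proposed[pid] = value
        # Step 2: the first process to set the bit wins and keeps its value.
        if self._tas.test_and_set() == 0:
            return value
        # Step 3: the loser adopts the winner's value, which was written
        # before the winner's test-and-set and is therefore visible here.
        return self._proposed[1 - pid]


def run_once(v0, v1):
    """Run both processes concurrently and return their decisions."""
    consensus = TwoProcessConsensus()
    results = [None, None]

    def worker(pid, value):
        results[pid] = consensus.decide(pid, value)

    threads = [
        threading.Thread(target=worker, args=(pid, value))
        for pid, value in enumerate((v0, v1))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_once("a", "b")` returns two equal decisions, each of which is one of the proposed values; which proposal wins depends on scheduling.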

In this thesis, we investigate whether the well-known hierarchy proposed by Herlihy is robust. We prove that, contrary to popular belief, this hierarchy is not robust. Thus, objects at a low level in Herlihy's hierarchy are not necessarily weak: they can be used to implement wait-free objects at higher levels. We therefore propose three natural variants of Herlihy's hierarchy. We prove that two of these are not robust. The robustness of the third is open.

Contributors
  • Dartmouth College
