Open Commit Protocols Tolerating Commission Failures
Abstract
To ensure atomicity of transactions in distributed systems so-called 2-phase commit 1993 protocols have been proposed. The basic assumption of these protocols is that the processing nodes involved in transactions are “sane,” i.e., they only fail with omission failures, and nodes eventually recover from failures. Unfortunately, this assumption is not realistic for so-called Open Distributed Systems (ODSs), in which nodes may have totally different reliability characteristics. In ODSs, nodes can be classified into trusted nodes (e.g., a banking server) and nontrusted nodes (e.g., a home PC requesting a remote banking service). While trusted nodes are assumed to be sane, nontrusted nodes may fail permanently and even cause commission failures to occur. In this paper, we propose a family of 2PC protocols that tolerate any number of omission failures at trusted nodes and any number of commission and omission failures at nontrusted nodes. The proposed protocols ensure that (at least) the trusted nodes participating in a transaction eventually terminate the transaction in a consistent manner. Unlike Byzantine commit protocols, our protocols do not incorporate mechanisms for achieving Byzantine agreement, which has advantages in terms of complexity: Our protocols have the same or only a slightly higher message complexity than traditional 2PC protocols. © 1993, ACM. All rights reserved.