Database Partitioning in a Cluster of Processors
Abstract
In a distributed database system the partitioning and allocation of the database over the processor nodes of the network can be a critical aspect of the database design effort. In this paper we develop and evaluate algorithms that perform this task in a computationally feasible manner. The network we consider is characterized by a relatively high communication bandwidth, considering the processing and input output capacities in its processors. Such a balance is typical if the processors are connected via busses or local networks. The common constraint that transactions have a specific root node no longer exists, so that there are more distribution choices. However, a poor distribution leads to less efficient computation, higher costs, and higher loads in the nodes or in the communication network so that the system may not be able to handle the required set of transactions. Our approach is to first split the database into fragments which constitute appropriate units for allocation. The fragments to be allocated are selected based on maximal benefit criteria using a greedy heuristic. The assignment to processor nodes uses a first-fit algorithm. The complete algorithm, called GFF, is stated in a procedural form. The complexity of the problem and of its candidate solutions are analyzed and several interesting relationships are proven. Alternate benefit metrics are considered, since the execution cost of the allocation procedure varies by orders of magnitude with the alternatives of benefit evaluation. A mixed benefit evaluation strategy is eventually proposed. A model for evaluation is presented. Two of the strategies are experimentally evaluated, and the reported results support the discussion. The approach should be suitable for other cases where resources have to be allocated subject to resource constraints. © 1985, ACM. All rights reserved.