Limits of parallelism in hash join algorithms
Abstract
The performance of parallel hash join algorithms is analyzed in an environment where several join queries are running concurrently. Analytical models for predicting the throughput and response time of join queries are developed. We consider two important parallel join algorithms: hybrid hash and Grace join. The effect of skew on the performance of these algorithms is examined. Results based on the analytical models, as well as simulation results, are presented. Some of the results obtained are quite unusual. For instance, in the case of the hybrid hash algorithm, we show that, under heavy load, the response time versus degree of parallelism curve can have two local minima. We establish a simple rule of thumb for choosing the degree of parallelism in order to maximize the throughput of the hybrid hash algorithm. In the case of Grace join, we derive asymptotic conditions on the amount of skew for a limit on parallelism to exist. © 1994.