A performance study of three high availability data replication strategies
Abstract
Several data replication strategies have been proposed to provide high data availability for database applications. However, the trade-offs among the different strategies for various workloads and different operating modes have not been studied before. In this paper, we study the relative performance of three high-availability data replication strategies, chained declustering, mirrored disks, and interleaved declustering, in a shared-nothing database machine environment. In particular, we have examined (1) the relative performance of the three strategies when no failures have occurred, (2) the effect of load imbalance caused by a disk or processor failure on system throughput and response time, and (3) the trade-off between the benefit of intra-query parallelism and the overhead of activating and scheduling extra operator processes. Experimental results obtained from a simulation study indicate that, in the normal mode of operation, chained declustering and interleaved declustering perform comparably. Both perform better than mirrored disks if an application is I/O bound, but slightly worse than mirrored disks if the application is CPU bound. In the event of a disk failure, chained declustering is able to balance the workload among all remaining operational disks while the other two strategies cannot; as a result, it provides noticeably better performance than interleaved declustering and much better performance than mirrored disks. © 1993 Kluwer Academic Publishers.
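The load-balancing argument for the failure case can be illustrated with a back-of-the-envelope calculation. The sketch below is not the simulation model used in the study. It assumes a uniform, read-only workload, ideal redistribution of the failed disk's load, and a hypothetical `cluster_size` parameter for interleaved declustering. The function name and its parameters are introduced here for illustration only.

```python
from fractions import Fraction


def peak_load_after_failure(strategy: str, n_disks: int, cluster_size: int = 4) -> Fraction:
    """Heaviest per-disk load after one disk fails, relative to a normal load of 1.

    Assumes a uniform read-only workload and ideal rebalancing (illustrative only).
    """
    if strategy == "mirrored":
        # The surviving disk of the pair serves its own share plus its partner's.
        return Fraction(2)
    if strategy == "interleaved":
        # The failed disk's load is spread over the other disks in its cluster only.
        return 1 + Fraction(1, cluster_size - 1)
    if strategy == "chained":
        # Load is shifted along the chain so all n_disks - 1 survivors share it evenly.
        return Fraction(n_disks, n_disks - 1)
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    for name in ("mirrored", "interleaved", "chained"):
        print(name, peak_load_after_failure(name, n_disks=8, cluster_size=4))
```

Under these assumptions, with 8 disks and clusters of 4, the most heavily loaded disk carries 2 times its normal load with mirrored disks, 4/3 with interleaved declustering, and 8/7 with chained declustering, which matches the ordering reported for the failure case.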