Decision-support workload characteristics on a clustered database server from the OS perspective
Abstract
A range of database services is offered on clusters of workstations today to meet the needs of applications with voluminous datasets, high computational and I/O requirements, and a large number of users. The underlying database engine runs on cost-effective, off-the-shelf hardware and software components that may not be tailored or tuned for these applications. At the same time, many of these databases contain legacy code that may not be easy to adapt to the evolving capabilities and limitations of clusters. An in-depth understanding of the interaction between these database engines and the underlying operating system (OS) can identify a set of characteristics that would be valuable for future research on systems support for these environments. To our knowledge, no prior work has undertaken such a characterization for a clustered database server. Using IBM DB2 Universal Database (UDB) Extended Enterprise Edition (EEE) V7.2 Trial version and TPC-H-like decision-support queries, this paper studies these issues by evaluating performance on an off-the-shelf Pentium/Linux cluster connected by Myrinet. The results include detailed performance profiles of all kernel activities, as well as qualitative and quantitative insights into the interaction between the database engine and the operating system.