WinMagic: Subquery Elimination Using Window Aggregation
Abstract
Database queries often take the form of correlated SQL queries. Correlation refers to the use of values from the outer query block to compute the inner subquery. This is a convenient paradigm for SQL programmers and closely mimics a function invocation paradigm in a typical computer programming language. Queries with correlated subqueries are also often created by SQL generators that translate queries from application domain-specific languages into SQL. Another significant class of queries that use this correlated subquery form is that involving temporal databases using SQL. Performance of these queries is an important consideration particularly in large databases. Several proposals to improve the performance of SQL queries containing correlated subqueries can be found in database literature. One of the main ideas in many of these proposals is to suitably decorrelate the subquery internally to avoid a tuple-at-a-time invocation of the subquery. Magic decorrelation is one method that has been successfully used. Another proposal is to cache the portion of the subquery that is invariant with the changing values of the outer query block. What we propose here is a new technique to handle some typical correlated queries. We go a step further than to simply decorrelate the subquery. By making use of extended window aggregation capabilities, we eliminate redundant access to common tables referenced in the outer query block and the subquery. This technique can be exploited even for non-correlated subqueries. It is possible to get a huge boost in performance for queries that can exploit this technique, which we call WinMagic. This technique was implemented in IBM® DB2® Universal Database™ Version 7 and Version 8. In addition to improving DB2 customer queries that contain aggregation subqueries, it has provided significant improvements in a number of TPCH benchmarks that IBM has published since late in 2001.