Rectangle-efficient aggregation in spatial data streams
Abstract
We consider the estimation of aggregates over a data stream of multidimensional axis-aligned rectangles. Rectangles are a basic primitive object in spatial databases, and efficient aggregation of rectangles is a fundamental task. The data stream model has emerged as a de facto model for processing massive databases in which the data resides in external memory or the cloud and is streamed through main memory. For a point p, let n(p) denote the sum of the weights of all rectangles in the stream that contain p. We give near-optimal solutions for basic problems, including (1) the k-th frequency moment F k = Σ points p|n(p)| k, (2) the counting version of stabbing queries, which seeks an estimate of n(p) given p, and (3) identification of heavy-hitters, i.e., points p for which n(p) is large. An important special case of F k is F 0, which corresponds to the volume of the union of the rectangles. This is a celebrated problem in computational geometry known as "Klee's measure problem", and our work yields the first solution in the streaming model for dimensions greater than one. © 2012 ACM.