Efficient B-tree Based Indexing for Cloud Data Processing

Sai Wu; Dawei Jiang; Beng Chin Ooi; Kun-Lung Wu

doi:10.14778/1920841.1920991

Publication

VLDB

Abstract

A Cloud may be seen as a type of flexible computing infrastructure consisting of many compute nodes, where resizable computing capacities can be provided to different customers. To fully harness the power of the Cloud, efficient data management is needed to handle huge volumes of data and support a large number of concurrent end users. To achieve that, a scalable and high-throughput indexing scheme is generally required. Such an indexing scheme must not only incur a low maintenance cost but also support parallel search to improve scalability. In this paper, we present a novel, scalable B+-tree based indexing scheme for efficient data processing in the Cloud. Our approach can be summarized as follows. First, we build a local B+-tree index for each compute node which only indexes data residing on the node. Second, we organize the compute nodes as a structured overlay and publish a portion of the local B+-tree nodes to the overlay for efficient query processing. Finally, we propose an adaptive algorithm to select the published B+-tree nodes according to query patterns. We conduct extensive experiments on Amazon's EC2, and the results demonstrate that our indexing scheme is dynamic, efficient and scalable. © 2010 VLDB Endowment.
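The first two steps described in the abstract (a local index per compute node, plus a published summary used to route queries) can be illustrated with a minimal sketch. This is not the authors' implementation: a sorted list stands in for each local B+-tree, a plain list of published key ranges stands in for the structured overlay, and the names `ComputeNode`, `GlobalIndex`, `published_range`, and `range_search` are hypothetical. The adaptive selection of which B+-tree nodes to publish is not modeled; each node simply publishes its overall key range.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """A compute node that indexes only the keys residing on it."""
    name: str
    # Assumption: a sorted list stands in for the local B+-tree.
    keys: list = field(default_factory=list)

    def insert(self, key):
        # Keep the local index sorted, as a B+-tree's leaf level would be.
        self.keys.insert(bisect_left(self.keys, key), key)

    def published_range(self):
        # Assumption: publish only a coarse summary (min and max key),
        # roughly what publishing a root-level B+-tree node would expose.
        return (self.keys[0], self.keys[-1]) if self.keys else None

    def local_range_search(self, lo, hi):
        # Inclusive range scan over the local index.
        return self.keys[bisect_left(self.keys, lo):bisect_right(self.keys, hi)]


class GlobalIndex:
    """Hypothetical stand-in for the overlay: published ranges mapped to nodes."""

    def __init__(self):
        self.entries = []  # list of (lo, hi, node)

    def publish(self, node):
        rng = node.published_range()
        if rng is not None:
            self.entries.append((rng[0], rng[1], node))

    def range_search(self, lo, hi):
        # Contact only nodes whose published range overlaps the query range.
        results, contacted = [], []
        for node_lo, node_hi, node in self.entries:
            if node_lo <= hi and lo <= node_hi:
                contacted.append(node.name)
                results.extend(node.local_range_search(lo, hi))
        return sorted(results), contacted
```

In this sketch a range query touches only the compute nodes whose published ranges overlap the query, and each contacted node answers from its local index. In a real deployment the contacted nodes could be searched in parallel, and the overlay lookup would itself be distributed rather than a linear scan.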

Date

01 Jan 2010

Publication

VLDB

Authors

IBM-affiliated at time of publication

Topics

Computer Science
