Selectively materializing data in mediators by analyzing user queries
Abstract
There is currently great interest in building information mediators that can integrate information from multiple data sources such as databases or Web sources. The query response time of such mediators is typically quite high, mainly because of the time spent retrieving data from remote sources. We present an approach for optimizing the performance of information mediators by selectively materializing data. We first describe our overall framework for materialization in a mediator environment and outline the factors that are considered in selecting data to materialize. We then present an algorithm that identifies classes of data to materialize by analyzing one of these factors, the distribution of user queries. Results from an implemented version of our optimization system for the Ariadne information mediator show the effectiveness of the algorithm in extracting patterns of frequently accessed classes from user queries. We also demonstrate the effectiveness of our approach in optimizing mediator performance by materializing such classes.
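To make the idea of extracting frequently accessed classes from user queries concrete, the following is a minimal sketch and not the algorithm presented in this paper. It assumes a hypothetical query representation, a pair of a class name and a set of selection constraints, and a hypothetical `min_support` threshold; it simply counts how often each such pattern occurs in a query log and keeps those that account for a sufficient share of the log.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical representation (an assumption, not taken from the paper):
# a query is a class name plus the set of selection constraints placed on it.
Query = Tuple[str, FrozenSet[str]]


def frequent_classes(
    queries: Iterable[Query], min_support: float
) -> List[Tuple[Query, float]]:
    """Return the query patterns whose share of the log is at least min_support.

    Each result pairs a pattern with its relative frequency, ordered from
    most to least frequent.
    """
    log = list(queries)
    if not log:
        return []
    counts = Counter(log)
    total = len(log)
    selected = [
        (pattern, count / total)
        for pattern, count in counts.items()
        if count / total >= min_support
    ]
    # Most frequent first; ties are broken by class name for a stable order.
    selected.sort(key=lambda item: (-item[1], item[0][0]))
    return selected


# Illustrative query log with made-up class and constraint names.
example_log: List[Query] = [
    ("country", frozenset({"region = 'Europe'"})),
    ("country", frozenset({"region = 'Europe'"})),
    ("country", frozenset({"region = 'Europe'"})),
    ("country", frozenset()),
    ("airport", frozenset({"country = 'France'"})),
]

candidates = frequent_classes(example_log, min_support=0.5)
```

In this toy log, only the class of European countries passes the threshold, so it would be the sole candidate for materialization. Query frequency is only one of the factors involved in choosing what to materialize, so a count like this would be an input to the selection rather than the selection itself.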