Boosting collaborative filtering based on statistical prediction errors
Abstract
User-based collaborative filtering methods typically predict a user's item ratings as a weighted average of the ratings given by similar users, where each weight is proportional to the user similarity. The accuracy of the user similarity is therefore key to the success of the recommendation, both for selecting neighborhoods and for computing predictions. However, the computed similarities between users are somewhat inaccurate due to data sparsity. For a given user, the sets of neighbors selected for predicting ratings on different items typically overlap. Thus, the error terms contributing to the rating predictions tend to be shared, which leads to correlated prediction errors. Through a set of case studies, we found that, for a given user, the prediction errors on different items are correlated with the similarities of the corresponding items and with the degree to which the items share common neighbors. We propose a framework that improves prediction accuracy based on these statistical prediction errors, together with two different strategies for estimating the prediction error on a desired item. Our experiments show that these approaches significantly improve the prediction accuracy of standard user-based methods and that they outperform other state-of-the-art methods. © 2008 ACM.
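The weighted-average prediction that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the choice of Pearson correlation as the similarity measure, the neighborhood size `k`, the exclusion of non-positive similarities, the function names `pearson` and `predict`, and the toy ratings are all assumptions made for the example.

```python
import math

def pearson(ra, rb):
    """Pearson correlation over items both users rated; 0.0 if undefined.

    Assumption: the abstract does not name a similarity measure.
    """
    common = [i for i in ra if i in rb]
    if len(common) < 2:
        return 0.0
    mean_a = sum(ra[i] for i in common) / len(common)
    mean_b = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - mean_a) * (rb[i] - mean_b) for i in common)
    den = math.sqrt(sum((ra[i] - mean_a) ** 2 for i in common)) * \
          math.sqrt(sum((rb[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, user, item, k=2):
    """Similarity-weighted average of the k nearest neighbors' ratings.

    Returns None when no positively similar user has rated the item.
    """
    candidates = [
        (pearson(ratings[user], other), other[item])
        for name, other in ratings.items()
        if name != user and item in other
    ]
    # Keep only positively correlated users, most similar first.
    neighbors = sorted((c for c in candidates if c[0] > 0), reverse=True)[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return None
    return sum(sim * rating for sim, rating in neighbors) / total

# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 3, "d": 4},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
    "dave":  {"a": 5, "b": 3, "c": 5, "d": 5},
}

prediction = predict(ratings, "alice", "d")
print(prediction)
```

In this sketch every prediction for a given user draws on largely the same neighbors and the same similarity weights. An inaccurate similarity estimate therefore affects the predictions for many items at once, which is the source of the correlated prediction errors that the abstract describes.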