Publication
IC2E 2015
Conference paper

SmartCache: An optimized MapReduce implementation of Frequent Itemset Mining

Abstract

Frequent Itemset Mining (FIM) is a classic data mining topic with many real-world applications, such as market basket analysis. Many FIM algorithms have been proposed, including Apriori, FP-Growth, and Eclat. As dataset sizes grow, researchers have proposed MapReduce versions of FIM algorithms to meet the big data challenge. This paper proposes SmartCache, which improves the MapReduce implementation of FIM algorithms by introducing a cache layer and a selective online analyzer. We evaluated the effectiveness and efficiency of SmartCache through extensive experiments on four public datasets. Compared with the state-of-the-art solution, SmartCache reduces total execution time by 45.4% on average, and by up to 97.0%.
