Resource constrained stream mining with classifier tree topologies
Abstract
Stream mining applications must identify several different attributes in data content and therefore rely on a distributed set of cascaded statistical classifiers to filter and process the data dynamically. In this letter, we introduce a novel methodology for configuring cascaded classifier topologies, specifically binary classifier trees, with optimized operating points. The optimization jointly considers the misclassification cost of each end-to-end class of interest in the tree, the resource constraints of every classifier, and the confidence level of each classified data object. By configuring multiple operating points per classifier, we enable both intelligent load shedding when resources are scarce and intelligent replication of low-confidence data across multiple edges when excess resources are available. Using a classifier tree constructed from support vector machine-based sports image classifiers, we verify substantial cost savings and discuss how different classifier placements and costs influence the gains obtained by various algorithms. © 2008 IEEE.
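As an illustration of what multiple operating points per classifier can mean, the following minimal sketch gives one node of a binary classifier tree a separate decision threshold for each outgoing edge. It is not the letter's algorithm: the names `NodeConfig` and `route`, the edge labels, and the assumption that each data object carries a confidence score in [0, 1] are all hypothetical. When the two thresholds overlap, low-confidence objects are replicated on both edges; when they leave a gap, low-confidence objects are shed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeConfig:
    """Hypothetical per-edge operating points for one binary classifier node."""
    yes_threshold: float  # forward on the "yes" edge if score >= this value
    no_threshold: float   # forward on the "no" edge if score < this value


def route(score: float, cfg: NodeConfig) -> List[str]:
    """Return the outgoing edges that receive a data object.

    score is assumed to be the classifier's confidence in [0, 1]
    that the object belongs to the positive class.
    """
    edges = []
    if score >= cfg.yes_threshold:
        edges.append("yes")
    if score < cfg.no_threshold:
        edges.append("no")
    return edges


# Excess resources: thresholds overlap, so scores in [0.4, 0.6) go to both edges.
replicate_cfg = NodeConfig(yes_threshold=0.4, no_threshold=0.6)

# Scarce resources: thresholds leave a gap, so scores in [0.4, 0.6) are dropped.
shed_cfg = NodeConfig(yes_threshold=0.6, no_threshold=0.4)

if __name__ == "__main__":
    for s in (0.1, 0.5, 0.9):
        print(s, route(s, replicate_cfg), route(s, shed_cfg))
```

Setting both thresholds to the same value recovers an ordinary single operating point, where every object follows exactly one edge. Choosing the thresholds so that misclassification cost is minimized under per-classifier resource constraints is the optimization the letter addresses, and it is not attempted here.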