Clustering crowds
Abstract
We present a clustered personal classifier (CPC) method that jointly estimates a classifier and clusters of workers in order to address the learning from crowds problem. Crowdsourcing allows us to create a large but low-quality data set at very low cost, and the learning from crowds problem is to learn a classifier from such a low-quality data set. Observations suggest that workers form clusters according to their abilities. Although this fact has been pointed out several times, no method has applied it to the learning from crowds problem. The proposed CPC method estimates both the classifier and the clusters of workers, and uses the clusters to improve the performance of the resulting classifier. The method has two advantages. First, it estimates the classifier robustly because it exploits the prior knowledge that workers tend to form clusters. Second, it yields the clusters of workers, which help us analyze the workers' properties. Experimental results on synthetic and real data sets indicate that the proposed method estimates the classifier robustly and that the worker clustering works well. In particular, on the real data set, the proposed method identified an outlier worker.
Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.