PICS: Parameter-free identification of cohesive subgroups in large attributed graphs
Abstract
Given a graph with node attributes, how can we find meaningful patterns such as clusters, bridges, and outliers? Attributed graphs appear in real world in the form of social networks with user interests, gene interaction networks with gene expression information, phone call networks with customer demographics, and many others. In effect, we want to group the nodes into clusters with similar connectivity and homogeneous attributes. Most existing graph clustering algorithms either consider only the connectivity structure of the graph and ignore the node attributes, or require several user-defined parameters such as the number of clusters. We propose PICS, a novel, parameter-free method for mining at- Tributed graphs. Two key advantages of our method are that (1) it requires no user-specified parameters such as the number of clusters and similarity functions, and (2) its running time scales linearly with total graph and attribute size. Our experiments show that PICS reveals meaningful and insightful patterns and outliers in both synthetic and real datasets, including call networks, political books, political blogs, and collections from Twitter and YouTube which have more than 70K nodes and 30K attributes. Copyright © 2012 by the Society for Industrial and Applied Mathematics.