User community reconstruction using sampled microblogging data
Abstract
User community recognition in social media services is important to identify hot topics or users' interests and concerns in a timely way when a disaster has occurred. In microblogging services, many short messages are posted every day and some of them represent replies or forwarded messages between users. We extract such conversational messages to link the users as a user network and regard the strongly-connected components in the network as indicators of user communities. However, using all of the microblog data for user community extraction is too costly and requires too much storage space when decomposing stronglyconnected components. In contrast, using sampled data may miss some user connections and thus divide one user community into pieces. In this paper, we propose a method for user community reconstruction using the lexical similarity of the messages and the user's link information between separate communities. Copyright is held by the International World Wide Web Conference Committee (IW3C2).