ETree: Effective and efficient event modeling for real-time online social media networks
Abstract
Online social media networks (OSMNs) such as Twitter provide great opportunities for public engagement and event information dissemination. Event-related discussions occur in real time and on a worldwide scale. However, these discussions take the form of short, unstructured messages that are dynamically woven into daily chats and status updates. Compared with traditional news articles, this rich and diverse user-generated content raises unique challenges for tracking and analyzing events. Effective and efficient event modeling is thus essential for real-time, information-intensive OSMNs. In this work, we propose ETree, an effective and efficient event modeling solution for social media network sites. Targeting the unique challenges of this problem, ETree consists of three key components: (1) an n-gram based content analysis technique for identifying core information blocks from a large number of short messages; (2) an incremental and hierarchical modeling technique for identifying and constructing event theme structures at different granularities; and (3) an enhanced temporal analysis technique for identifying inherent causalities between information blocks. Detailed evaluation using 3.5 million tweets over a 5-month period demonstrates that ETree can efficiently generate high-quality event structures and identify inherent causal relationships with high accuracy. © 2011 IEEE.
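
As an illustration of component (1), the sketch below shows one simple way that n-grams recurring across many short messages could be surfaced as candidate key phrases. It is not the ETree algorithm, which the abstract does not specify. The function names (tokenize, ngrams, frequent_ngrams), the tokenization rule, and the min_support threshold are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(message):
    # Assumed tokenization: lowercase, keep letters, digits, '#', '@' and apostrophes.
    return re.findall(r"[a-z0-9#@']+", message.lower())


def ngrams(tokens, n):
    # All contiguous n-token windows, as tuples.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequent_ngrams(messages, n_values=(2, 3), min_support=2):
    # Count each n-gram at most once per message, then keep those that
    # appear in at least min_support messages (hypothetical threshold).
    counts = Counter()
    for message in messages:
        tokens = tokenize(message)
        seen = set()
        for n in n_values:
            seen.update(ngrams(tokens, n))
        counts.update(seen)
    return {gram: c for gram, c in counts.items() if c >= min_support}


if __name__ == "__main__":
    sample = [
        "Earthquake hits Japan coast",
        "breaking: earthquake hits japan",
        "lunch was great today",
    ]
    for gram, support in sorted(frequent_ngrams(sample).items()):
        print(" ".join(gram), support)
```

Counting per message rather than per occurrence keeps a single repetitive message from dominating the result. A real system would also need to handle noise, spelling variants, and streaming updates.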