2011 Reports
Beyond Trending Topics: Real-World Event Identification on Twitter
User-contributed messages on social media sites such as Twitter have emerged as powerful, real-time means of information sharing on the Web. These short messages tend to reflect a variety of events in real time, earlier than other social media sites such as Flickr or YouTube, making Twitter particularly well suited as a source of real-time event content. In this paper, we explore approaches for analyzing the stream of Twitter messages to distinguish between messages about real-world events and non-event messages. Our approach relies on a rich family of aggregate statistics of topically similar message clusters, including temporal, social, topical, and Twitter-centric features. Our large-scale experiments over millions of Twitter messages show the effectiveness of our approach for surfacing real-world event content on Twitter.
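As a rough illustration of what "aggregate statistics of topically similar message clusters" might look like in practice, the sketch below computes one or two simple statistics per feature family for a single cluster. It is not the report's actual feature set: the `Message` type, its `hour` field, the `cluster_features` function, and every feature definition here are hypothetical stand-ins chosen for brevity, and whitespace tokenization is an assumption.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Message:
    text: str
    hour: int  # hypothetical: hour bucket in which the message was posted


def cluster_features(messages: List[Message]) -> Dict[str, float]:
    """Aggregate a few illustrative statistics over one cluster of messages."""
    n = len(messages)
    if n == 0:
        raise ValueError("cluster must contain at least one message")

    # Assumption: lowercase whitespace tokenization is good enough for a sketch.
    tokenized = [m.text.lower().split() for m in messages]

    # Temporal: share of the cluster's messages posted in its busiest hour.
    hour_counts = Counter(m.hour for m in messages)
    peak_hour_share = max(hour_counts.values()) / n

    # Social: share of messages that are retweets, and share that mention a user.
    retweet_share = sum(toks[:1] == ["rt"] for toks in tokenized) / n
    mention_share = sum(
        any(tok.startswith("@") for tok in toks) for toks in tokenized
    ) / n

    # Topical: share of messages containing the cluster's most widespread term,
    # ignoring the "rt" marker, mentions, and hashtags.
    doc_freq: Counter = Counter()
    for toks in tokenized:
        doc_freq.update(
            {t for t in toks if t != "rt" and not t.startswith(("@", "#"))}
        )
    top_term_share = doc_freq.most_common(1)[0][1] / n if doc_freq else 0.0

    # Twitter-centric: share of messages carrying at least one hashtag.
    hashtag_share = sum(
        any(tok.startswith("#") for tok in toks) for toks in tokenized
    ) / n

    return {
        "peak_hour_share": peak_hour_share,
        "retweet_share": retweet_share,
        "mention_share": mention_share,
        "top_term_share": top_term_share,
        "hashtag_share": hashtag_share,
    }
```

A feature vector of this kind, computed per cluster, could then be passed to any standard classifier to separate event clusters from non-event clusters; which classifier and which features the report actually uses is described in the PDF itself.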
Files
- cucs-012-11.pdf (application/pdf, 322 KB)
More About This Work
- Academic Units: Computer Science
- Publisher: Department of Computer Science, Columbia University
- Series: Columbia University Computer Science Technical Reports, CUCS-012-11
- Published Here: July 8, 2011