research-article

Can We Predict a Riot? Disruptive Event Detection Using Twitter

Authors:
Nasser Alsaedi (Cardiff University, UK), Pete Burnap (Cardiff University, UK), Omer Rana (Cardiff University, UK)

ACM Transactions on Internet Technology, Volume 17, Issue 2, Article No. 18, pp. 1–26. https://doi.org/10.1145/2996183

Published: 27 March 2017

Abstract

In recent years, there has been increased interest in real-world event detection using publicly accessible data made available through Internet technology such as Twitter, Facebook, and YouTube. In these highly interactive systems, the general public are able to post real-time reactions to “real world” events, thereby acting as social sensors of terrestrial activity. Automatically detecting and categorizing events, particularly small-scale incidents, using streamed data is a non-trivial task but would be of high value to public safety organisations such as local police, who need to respond accordingly. To address this challenge, we present an end-to-end integrated event detection framework that comprises five main components: data collection, pre-processing, classification, online clustering, and summarization. Integrating classification with clustering enables the detection of events as well as related smaller-scale “disruptive events”: incidents that threaten social safety and security or could disrupt social order. We evaluate the effectiveness of detecting events using a variety of features derived from Twitter posts, namely temporal, spatial, and textual content, on a large-scale, real-world dataset from Twitter. Furthermore, we apply our event detection system to a large corpus of tweets posted during the August 2011 riots in England. We use ground-truth data based on intelligence gathered by the London Metropolitan Police Service, which provides a record of actual terrestrial events and incidents during the riots, and show that our system can perform as well as terrestrial sources, and even better in some cases.
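
To make the shape of such a pipeline concrete, the following is a minimal Python sketch of four of the five stages named in the abstract (pre-processing, classification, online clustering, summarization), with data collection replaced by an in-memory list. It is an illustration under stated assumptions, not the authors' implementation: the keyword list `EVENT_TERMS`, the stopword set, the cosine threshold of 0.3, and all function names are hypothetical, and the keyword filter merely stands in for a trained classifier.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field

# Assumption: a tiny stopword set and keyword list, chosen only for illustration.
STOPWORDS = {"a", "an", "the", "is", "in", "on", "at", "of", "to", "and", "are"}
EVENT_TERMS = {"fire", "riot", "looting", "police", "crash"}


def preprocess(text: str) -> list[str]:
    """Lowercase, strip URLs and @mentions, tokenize, drop stopwords."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return [t for t in re.findall(r"[a-z]+", text) if t not in STOPWORDS]


def is_event_related(tokens: list[str]) -> bool:
    """Stand-in for a trained classifier: keep tweets containing an event term."""
    return any(t in EVENT_TERMS for t in tokens)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class Cluster:
    centroid: Counter = field(default_factory=Counter)  # summed term counts
    tweets: list[str] = field(default_factory=list)

    def add(self, text: str, tokens: list[str]) -> None:
        self.centroid.update(tokens)
        self.tweets.append(text)


def detect_events(stream, threshold: float = 0.3) -> list[Cluster]:
    """Single-pass clustering: join the most similar cluster or start a new one."""
    clusters: list[Cluster] = []
    for text in stream:
        tokens = preprocess(text)
        if not is_event_related(tokens):
            continue  # filtered out by the classification stage
        vec = Counter(tokens)
        best = max(clusters, key=lambda c: cosine(vec, c.centroid), default=None)
        if best is None or cosine(vec, best.centroid) < threshold:
            best = Cluster()
            clusters.append(best)
        best.add(text, tokens)
    return clusters


def summarize(cluster: Cluster) -> str:
    """Pick the tweet closest to the cluster centroid as a one-line summary."""
    return max(
        cluster.tweets,
        key=lambda t: cosine(Counter(preprocess(t)), cluster.centroid),
    )
```

Calling `detect_events` on an iterable of tweet texts returns one `Cluster` per candidate event, and `summarize` picks a representative tweet from each. Temporal and spatial features, which the abstract lists alongside textual content, are left out of this sketch.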

Index Terms

1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy
  2. Embedded and cyber-physical systems
    1. Embedded systems
    2. Robotics
2. Networks
  1. Network properties
    1. Network reliability

Published in
ACM Transactions on Internet Technology Volume 17, Issue 2
Special Issue on Advances in Social Computing and Regular Papers
May 2017
249 pages
ISSN:1533-5399
EISSN:1557-6051
DOI:10.1145/3068849
Editor: Munindar P. Singh, Department of Computer Science, North Carolina State University
Copyright © 2017 Owner/Author
This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 27 March 2017
- Accepted: 1 September 2016
- Revised: 1 July 2016
- Received: 1 March 2016

Author Tags
Social media
classification
clustering
evaluation
event detection
feature selection
Qualifiers
- research-article
- Research
- Refereed