Can We Predict a Riot? Disruptive Event Detection Using Twitter

Published: 27 March 2017

Abstract

In recent years, there has been increased interest in real-world event detection using publicly accessible data made available through Internet technology such as Twitter, Facebook, and YouTube. In these highly interactive systems, the general public are able to post real-time reactions to “real world” events, thereby acting as social sensors of terrestrial activity. Automatically detecting and categorizing events, particularly small-scale incidents, using streamed data is a non-trivial task but would be of high value to public safety organisations such as local police, who need to respond accordingly. To address this challenge, we present an end-to-end integrated event detection framework that comprises five main components: data collection, pre-processing, classification, online clustering, and summarization. Integrating classification with clustering enables the detection of events as well as related smaller-scale “disruptive events”: incidents that threaten social safety and security or could disrupt social order. We present an evaluation of the effectiveness of detecting events using a variety of features derived from Twitter posts, namely temporal, spatial, and textual content. We evaluate our framework on a large-scale, real-world dataset from Twitter. Furthermore, we apply our event detection system to a large corpus of tweets posted during the August 2011 riots in England. We use ground-truth data based on intelligence gathered by the London Metropolitan Police Service, which provides a record of actual terrestrial events and incidents during the riots, and show that our system can perform as well as terrestrial sources, and even better in some cases.

Published in

ACM Transactions on Internet Technology, Volume 17, Issue 2
Special Issue on Advances in Social Computing and Regular Papers
May 2017
249 pages
ISSN: 1533-5399
EISSN: 1557-6051
DOI: 10.1145/3068849
Editor: Munindar P. Singh

Copyright © 2017 Owner/Author

This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 27 March 2017
• Accepted: 1 September 2016
• Revised: 1 July 2016
• Received: 1 March 2016

Qualifiers

• research-article
• Research
• Refereed
