research-article
Open Access

Fake News Early Detection: A Theory-driven Model

Published: 11 June 2020

Abstract

The massive dissemination of fake news and its potential to erode democracy have increased the demand for accurate fake news detection. Recent work in this area has proposed novel techniques that aim to detect fake news by exploring how it propagates on social networks. Nevertheless, to detect fake news at an early stage, i.e., when it has been published on a news outlet but has not yet spread on social media, one cannot rely on news propagation information because it does not yet exist. Hence, there is a strong need to develop approaches that can detect fake news by focusing on news content. In this article, a theory-driven model is proposed for fake news detection. The method investigates news content at four levels: lexicon, syntax, semantics, and discourse. We represent news at each level, relying on well-established theories in social and forensic psychology. Fake news detection is then conducted within a supervised machine learning framework. As interdisciplinary research, our work explores potential fake news patterns, enhances the interpretability of fake news feature engineering, and studies the relationships among fake news, deception/disinformation, and clickbait. Experiments conducted on two real-world datasets indicate that the proposed method can outperform the state of the art and enable early detection of fake news when content information is limited.
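
To make the content-only pipeline described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. The four features are hypothetical stand-ins chosen for illustration (one crude proxy per level), the connective list is invented, and a nearest-centroid classifier is swapped in for whatever supervised learner the article actually uses. The function names `content_features`, `train_centroids`, and `predict` are assumptions as well.

```python
import re
from collections import Counter

# Hypothetical connective list, used only as a crude discourse-level proxy.
CONNECTIVES = {"because", "however", "therefore", "although", "but", "so"}


def content_features(text):
    """Return four toy content-only features, one per level named in the abstract."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    n_sents = max(len(sentences), 1)
    return [
        len(set(words)) / n_words,  # lexicon level: type-token ratio
        n_words / n_sents,  # syntax level: mean sentence length in words
        text.count("!") / n_sents,  # semantic level: exclamation marks per sentence
        sum(w in CONNECTIVES for w in words) / n_words,  # discourse level: connective rate
    ]


def train_centroids(labeled_texts):
    """Supervised step: average the feature vectors of each label."""
    sums, counts = {}, Counter()
    for text, label in labeled_texts:
        vec = content_features(text)
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, text):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    vec = content_features(text)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vec, centroids[label])),
    )


# Tiny made-up training set: label 1 = fake, label 0 = real.
training_data = [
    ("SHOCKING!!! You won't believe this! Share now!", 1),
    ("The council approved the budget on Tuesday. However, two members "
     "voted against it because of cost concerns.", 0),
]
model = train_centroids(training_data)
print(predict(model, "Unbelievable!!! Doctors hate this trick!"))
```

Because every feature is computed from the text alone, a pipeline of this shape can score an article at publication time, before any propagation data exists. The features are left unscaled here for brevity; a real system would normalize them and use far richer, theory-grounded representations.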

Published in

Digital Threats: Research and Practice, Volume 1, Issue 2
Field Notes
June 2020
139 pages
EISSN: 2576-5337
DOI: 10.1145/3403598

Copyright © 2020 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 11 June 2020
• Online AM: 7 May 2020
• Accepted: 1 December 2019
• Revised: 1 November 2019
• Received: 1 April 2019

Qualifiers

• research-article
• Research
• Refereed
