Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News

Published: 26 June 2019

Abstract

Fake news is nowadays an issue of pressing concern, given its recent rise as a potential threat to high-quality journalism and well-informed public discourse. The Fake News Challenge (FNC-1) was organized in early 2017 to encourage the development of machine-learning-based classification systems for stance detection (i.e., for identifying whether a particular news article agrees, disagrees, discusses, or is unrelated to a particular news headline), thus helping in the detection and analysis of possible instances of fake news. This article presents a novel approach to tackle this stance detection problem, based on the combination of string similarity features with a deep neural network architecture that leverages ideas previously advanced in the context of learning efficient text representations, document classification, and natural language inference. Specifically, we use bi-directional Recurrent Neural Networks (RNNs), together with max-pooling over the temporal/sequential dimension and neural attention, for representing (i) the headline, (ii) the first two sentences of the news article, and (iii) the entire news article. These representations are then combined/compared, complemented with similarity features inspired by other FNC-1 approaches, and passed to a final layer that predicts the stance of the article toward the headline. We also explore the use of external sources of information, specifically large datasets of sentence pairs originally proposed for training and evaluating natural language inference methods, to pre-train specific components of the neural network architecture (e.g., the RNNs used for encoding sentences). The obtained results attest to the effectiveness of the proposed ideas and show that our model, particularly when considering pre-training and the combination of neural representations together with similarity features, slightly outperforms the previous state of the art.
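
As a concrete illustration of the architecture summarized above, the following is a minimal sketch, assuming PyTorch, of a bi-directional RNN encoder that combines max-pooling over the temporal/sequential dimension with neural attention, together with a classifier that combines/compares the resulting representations of the headline, the first two sentences, and the full article body alongside hand-crafted similarity features. All class names, dimensions, and the exact combination scheme are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: layer sizes, the feature-combination scheme, and
# all names below are assumptions, not the authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiRNNEncoder(nn.Module):
    """Bi-directional GRU summarized by max-pooling over time plus attention."""
    def __init__(self, embed_dim=300, hidden_dim=256):
        super().__init__()
        self.rnn = nn.GRU(embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)  # attention scoring vector

    def forward(self, x):
        # x: (batch, seq_len, embed_dim) pre-trained word embeddings.
        states, _ = self.rnn(x)            # (batch, seq_len, 2 * hidden_dim)
        max_pooled, _ = states.max(dim=1)  # max over the temporal dimension
        weights = F.softmax(self.attn(states), dim=1)  # attention weights
        attended = (weights * states).sum(dim=1)       # weighted average
        return torch.cat([max_pooled, attended], dim=-1)

class StanceClassifier(nn.Module):
    """Encodes the headline, the article's first two sentences, and the full
    body, then concatenates the representations, simple comparison terms, and
    external similarity features before a final 4-way stance prediction."""
    def __init__(self, embed_dim=300, hidden_dim=256, n_sim_feats=10):
        super().__init__()
        self.encoder = BiRNNEncoder(embed_dim, hidden_dim)
        rep = 4 * hidden_dim  # max-pooled + attended, each 2 * hidden_dim
        self.out = nn.Linear(5 * rep + n_sim_feats, 4)

    def forward(self, headline, lead, body, sim_feats):
        h, l, b = (self.encoder(t) for t in (headline, lead, body))
        combined = torch.cat(
            [h, l, b, torch.abs(h - b), h * b, sim_feats], dim=-1)
        # Logits for agree / disagree / discuss / unrelated.
        return self.out(combined)
```

Under this kind of setup, the encoder parameters could first be pre-trained on large natural language inference datasets of sentence pairs (e.g., SNLI or MultiNLI) and then fine-tuned on the FNC-1 data, mirroring the transfer strategy that the abstract describes.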

        • Published in

          Journal of Data and Information Quality, Volume 11, Issue 3
          Special Issue on Combating Digital Misinformation and Disinformation and On the Horizon
          September 2019
          160 pages
          ISSN: 1936-1955
          EISSN: 1936-1963
          DOI: 10.1145/3331015

          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 26 June 2019
          • Accepted: 1 October 2018
          • Revised: 1 September 2018
          • Received: 1 May 2018
          Published in JDIQ Volume 11, Issue 3
