ABSTRACT
Online media outlets, in a bid to expand their reach and increase revenue through ad monetisation, have begun adopting clickbait techniques to lure readers into clicking on articles. In such cases, the article fails to fulfill the promise made by the headline. Traditional methods for clickbait detection have relied heavily on feature engineering, which in turn depends on the dataset it is built for. The application of neural networks to this task has been explored only partially. We propose a novel approach that considers all the information found in a social media post. We train a bidirectional LSTM with an attention mechanism to learn the extent to which each word contributes to the post's clickbait score in a differential manner. We also employ a Siamese net to capture the similarity between source and target information. Information gleaned from images has not been considered in previous approaches. We learn image embeddings from large amounts of data using Convolutional Neural Networks to add another layer of complexity to our model. Finally, we concatenate the outputs of the three separate components and feed them as input to a fully connected layer. We conduct experiments over a test corpus of 19538 social media posts, attaining an F1 score of 65.37% on the dataset and bettering the previous state-of-the-art as well as other proposed approaches, feature engineering or otherwise.
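The attention-weighted pooling and the final fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product scoring, the function names (`attention_pool`, `fuse_and_score`), and the toy vectors are all assumptions made for illustration, and the hidden states stand in for what a trained bidirectional LSTM would produce.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Pool per-word hidden states into one vector.

    Assumption: each word is scored by a dot product with a learned
    context vector; the paper's exact scoring function may differ.
    """
    scores = [sum(h * c for h, c in zip(state, context)) for state in hidden_states]
    weights = softmax(scores)  # one weight per word, summing to 1
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights


def fuse_and_score(text_vec, similarity_vec, image_vec, fc_weights, fc_bias):
    """Concatenate the three component outputs and apply one dense unit.

    Assumption: a single sigmoid output unit stands in for the fully
    connected layer mentioned in the abstract.
    """
    fused = text_vec + similarity_vec + image_vec  # list concatenation
    logit = sum(w * x for w, x in zip(fc_weights, fused)) + fc_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy usage with made-up numbers (hypothetical, not from the paper).
states = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = attention_pool(states, context=[0.0, 0.0])
score = fuse_and_score(pooled, [0.2], [0.1, 0.3], [0.0] * 5, 0.0)
```

With a zero context vector every word receives equal attention, so the pooled vector is simply the mean of the hidden states; a trained context vector would shift the weights toward the words that matter most for the clickbait score.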
Index Terms
- Identifying Clickbait: A Multi-Strategy Approach Using Neural Networks