Abstract
With the advent of social network services, Arabs’ opinions on the web have drawn many researchers in recent years to detecting and classifying sentiment in Arabic tweets and reviews. However, the impact of word embedding vector (WEV) initialization and dataset balance on Arabic sentiment classification using deep learning has not been thoroughly studied. In this article, a multi-channel embedding convolutional neural network (MCE-CNN) is proposed to improve Arabic sentiment classification by learning sentiment features from different text domains and at both the word and character n-gram levels. MCE-CNN encodes a combination of different pre-trained word embeddings into the embedding block of each embedding channel and trains these channels in parallel. In addition, a separate feature extraction module, implemented as a CNN block, extracts more relevant sentiment features. Together, these channels and blocks allow training to start from high-quality WEVs, which are then fine-tuned. The performance of MCE-CNN is evaluated on several standard balanced and imbalanced datasets to reflect real-world use cases. Experimental results show that MCE-CNN achieves high classification accuracy, benefits from the second embedding channel on both standard Arabic and dialectal Arabic text, and outperforms state-of-the-art methods.
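The data flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the random tables stand in for pre-trained WEVs, the sizes (`VOCAB`, `DIM`, `WIDTH`, `N_FILTERS`) and helper names are hypothetical, and merging the channels by concatenation is an assumption, since the abstract does not say how channel outputs are combined. Only the forward pass is shown; training and fine-tuning of the embeddings are omitted.

```python
import random

random.seed(0)

# Hypothetical sizes chosen only for illustration.
VOCAB, DIM, WIDTH, N_FILTERS = 20, 8, 3, 4


def random_matrix(rows, cols):
    """Stand-in for a pre-trained embedding table or a bank of CNN filters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def embed(token_ids, table):
    """Look up one embedding vector per token id."""
    return [table[t] for t in token_ids]


def conv_relu_maxpool(seq, filters, width):
    """Slide each filter over windows of `width` tokens, apply ReLU, then max-pool over time."""
    pooled = []
    for f in filters:  # each filter is a flat list of width * dim weights
        best = 0.0  # ReLU floor: pooled activations are never negative
        for start in range(len(seq) - width + 1):
            window = [v for vec in seq[start:start + width] for v in vec]
            activation = sum(w * v for w, v in zip(f, window))
            best = max(best, activation)
        pooled.append(best)
    return pooled


# Two embedding channels, each with its own embedding table and CNN block.
channels = [
    {"table": random_matrix(VOCAB, DIM), "filters": random_matrix(N_FILTERS, WIDTH * DIM)}
    for _ in range(2)
]


def extract_features(token_ids):
    """Run every channel on the same input and concatenate the pooled features (assumed merge)."""
    features = []
    for channel in channels:
        seq = embed(token_ids, channel["table"])
        features.extend(conv_relu_maxpool(seq, channel["filters"], WIDTH))
    return features


features = extract_features([3, 7, 1, 12, 5, 9])
print(len(features))  # 2 channels * 4 filters = 8 features
```

In a full model, the concatenated feature vector would feed a classification layer, and each channel's embedding table would be initialized from a different pre-trained source and updated during training.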
Index Terms
- Multi-Channel Embedding Convolutional Neural Network Model for Arabic Sentiment Classification