ABSTRACT
Based on a large data set of emoji usage behavior collected from smartphone users around the world, this paper investigates gender-specific usage of emojis. We present findings that show a considerable difference in emoji usage between female and male users. This difference is significant not only in a statistical sense; it is sufficient for a machine learning algorithm to accurately infer a user's gender purely from the emojis used in their messages. In real-world scenarios where gender inference is necessary, models based on emojis have unique advantages over existing models based on textual or contextual information: emojis not only provide language-independent indicators but also reduce the risk of leaking private user information through the analysis of text and metadata.
Index Terms
- Through a Gender Lens: Learning Usage Patterns of Emojis from Large-Scale Android Users