DOI: 10.1145/3178876.3186157
research-article

Through a Gender Lens: Learning Usage Patterns of Emojis from Large-Scale Android Users

Published: 23 April 2018

ABSTRACT

Based on a large data set of emoji usage behavior collected from smartphone users around the world, this paper investigates gender-specific usage of emojis. We present findings that evidence a considerable difference in emoji usage between female and male users. This difference is significant not just in a statistical sense; it is sufficient for a machine learning algorithm to accurately infer the gender of a user purely from the emojis used in their messages. In real-world scenarios where gender inference is a necessity, models based on emojis have unique advantages over existing models based on textual or contextual information. Emojis not only provide language-independent indicators, but also alleviate the risk of leaking private user information through the analysis of text and metadata.
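To make the idea of classifying users from emojis alone more concrete, the following is a minimal sketch rather than the paper's method. The abstract does not name the algorithm or the features used, so everything here is an assumption: the five-emoji vocabulary, the function names, the neutral labels "a" and "b", and the choice of a multinomial naive Bayes classifier with Laplace smoothing. The sketch only shows that a model can be trained and queried without reading any of the surrounding text.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary of single-codepoint emojis (an assumption;
# a real system would cover the full emoji set).
EMOJI_VOCAB = ["😂", "😭", "👍", "😊", "🎉"]


def count_emojis(messages, vocab=EMOJI_VOCAB):
    """Count vocabulary emojis in a user's messages, ignoring all other text."""
    return Counter(ch for msg in messages for ch in msg if ch in vocab)


def emoji_frequencies(messages, vocab=EMOJI_VOCAB):
    """Return each emoji's share of the user's vocabulary-emoji usage."""
    counts = count_emojis(messages, vocab)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocab)
    return [counts[e] / total for e in vocab]


def train_nb(labeled_users, vocab=EMOJI_VOCAB, alpha=1.0):
    """Fit multinomial naive Bayes on (messages, label) pairs.

    alpha is the Laplace smoothing constant. Returns (log_prior, log_like).
    """
    class_counts = Counter()
    per_class = {}
    for messages, label in labeled_users:
        class_counts[label] += 1
        per_class.setdefault(label, Counter()).update(count_emojis(messages, vocab))
    n_users = sum(class_counts.values())
    log_prior = {y: math.log(k / n_users) for y, k in class_counts.items()}
    log_like = {}
    for y, counts in per_class.items():
        denom = sum(counts.values()) + alpha * len(vocab)
        log_like[y] = {e: math.log((counts[e] + alpha) / denom) for e in vocab}
    return log_prior, log_like


def predict(messages, model, vocab=EMOJI_VOCAB):
    """Return the label with the highest posterior log-score."""
    log_prior, log_like = model
    counts = count_emojis(messages, vocab)
    scores = {
        y: log_prior[y] + sum(counts[e] * log_like[y][e] for e in vocab)
        for y in log_prior
    }
    return max(scores, key=scores.get)


# Made-up toy data with neutral labels; not drawn from the paper.
toy_train = [
    (["😂😂 ok", "😂"], "a"),
    (["🎉🎉", "🎉 hi"], "b"),
]
toy_model = train_nb(toy_train)
```

Because `count_emojis` discards every non-emoji character, the same features apply to messages in any language, and no message text needs to be stored or analyzed.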

Published in

WWW '18: Proceedings of the 2018 World Wide Web Conference
April 2018
2000 pages
ISBN: 9781450356398

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland


Acceptance Rates

WWW '18 Paper Acceptance Rate: 170 of 1,155 submissions, 15%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
