Editorial Notes
"PRISM: Profession Identification in Social Media" by C. Tu, Z. Liu, H. Luan, and M. Sun, ACM TIST, Vol 8, Issue 6, Nov 2017, https://doi.org/10.1145/3070665, is an extension of "PRISM: Profession Identification in Social Media with Personal Information and Community Structure" by C. Tu, Z. Liu, and M. Sun, Social Media Processing Proceedings, SMP 2015, Communications in Computer and Information Science, 4th National Conference, Nov 2015, Springer, DOI: 10.1007/978-981-10-0080-5_2.
Abstract
Profession is an important social attribute of people. It plays a crucial role in commercial services such as personalized recommendation and targeted advertising. In practice, profession information is usually unavailable due to privacy and other reasons. In this article, we explore the task of identifying user professions from their behaviors in social media. Three challenges make the task non-trivial: how to incorporate heterogeneous information about user behaviors, how to make effective use of both labeled and unlabeled data, and how to exploit community structure. To address these challenges, we present a framework called Profession Identification in Social Media (PRISM). It takes advantage of both the personal information and the community structure of users in the following ways: (1) We present a cascaded two-level classifier with heterogeneous personal features to measure the confidence that a user belongs to each profession. (2) We present a multi-training process that takes advantage of both labeled and unlabeled data to enhance classification performance. (3) We design a profession identification method that jointly considers the confidences from personal features and community structure. We collect a real-world dataset to conduct experiments, and the experimental results demonstrate that our method is significantly more effective than the baseline methods. By applying prediction to large-scale users, we also analyze the characteristics of microblog users and find significant differences among users of different professions in demographics, social network structures, and linguistic styles.
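To make point (3) concrete, the following is a minimal sketch of one way confidences from personal features could be blended with confidences from a user's neighbors. It is not the method from the article: the linear interpolation, the weight `alpha`, the neighbor averaging, and the names `combine_confidences` and `predict` are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical types: per-user confidence scores keyed by profession label.
Confidence = Dict[str, float]


def combine_confidences(
    personal: Dict[str, Confidence],
    graph: Dict[str, List[str]],
    alpha: float = 0.5,
) -> Dict[str, Confidence]:
    """Blend each user's own confidence with the mean confidence of neighbors.

    Assumption: a simple linear interpolation with weight `alpha`; the
    article's actual combination rule is not specified in the abstract.
    """
    combined: Dict[str, Confidence] = {}
    for user, own in personal.items():
        # Only neighbors that have personal confidences can contribute.
        neighbors = [n for n in graph.get(user, []) if n in personal]
        blended: Confidence = {}
        for profession, score in own.items():
            if neighbors:
                community = sum(
                    personal[n].get(profession, 0.0) for n in neighbors
                ) / len(neighbors)
            else:
                # With no usable neighbors, fall back to the user's own score.
                community = score
            blended[profession] = alpha * score + (1.0 - alpha) * community
        combined[user] = blended
    return combined


def predict(combined: Dict[str, Confidence]) -> Dict[str, str]:
    """Pick the highest-scoring profession for each user."""
    return {user: max(scores, key=scores.get) for user, scores in combined.items()}


# Toy data, invented for this example only.
personal_scores = {
    "a": {"engineer": 0.4, "teacher": 0.6},
    "b": {"engineer": 0.9, "teacher": 0.1},
    "c": {"engineer": 0.8, "teacher": 0.2},
}
follow_graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}

combined_scores = combine_confidences(personal_scores, follow_graph, alpha=0.5)
labels = predict(combined_scores)
```

In this toy setup, user `a` leans toward "teacher" on personal features alone, but the neighbors' scores pull the blended prediction toward "engineer", which is the kind of effect that using community structure is meant to capture.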