DOI: 10.1145/3209978.3209995
research-article

A Personal Privacy Preserving Framework: I Let You Know Who Can See What

Published: 27 June 2018

ABSTRACT

The boom in social networks has given rise to a large volume of user-generated content (UGC), most of which is free and publicly available. Many personal aspects of users can be extracted from UGC to facilitate personalized applications, as validated by many previous studies. Despite its value, UGC can place users at high privacy risk, a problem that thus far remains largely unexplored. Privacy is defined as the individual's ability to control what information is disclosed, to whom, when, and under what circumstances. As people and information both play significant roles, privacy has been elaborated as a boundary regulation process, in which individuals regulate their interaction with others by altering their degree of openness to others. In this paper, we aim to reduce users' privacy risks on social networks by answering the question of Who Can See What. Towards this goal, we present a novel scheme comprising descriptive, predictive, and prescriptive components. In particular, we first collect a set of posts and extract a group of privacy-oriented features to describe the posts. We then propose a novel taxonomy-guided multi-task learning model to predict which personal aspects are uncovered by the posts. Lastly, we construct standard guidelines through a user study with 400 users to regularize users' actions and prevent privacy leakage. Extensive experiments on a real-world dataset verify the effectiveness of our scheme.


      • Published in

        SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
        June 2018
        1509 pages
        ISBN: 9781450356572
        DOI: 10.1145/3209978

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        SIGIR '18 Paper Acceptance Rate: 86 of 409 submissions, 21%. Overall Acceptance Rate: 792 of 3,983 submissions, 20%.
