research-article

A Personal Privacy Preserving Framework: I Let You Know Who Can See What

Authors:
Xuemeng Song, Shandong University, Qingdao, China
Xiang Wang, National University of Singapore, Singapore, Singapore
Liqiang Nie, Shandong University, Qingdao, China
Xiangnan He, National University of Singapore, Singapore, Singapore
Zhumin Chen, Shandong University, Qingdao, China
Wei Liu, Tencent AI Lab, Shenzhen, China

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, June 2018, Pages 295–304
https://doi.org/10.1145/3209978.3209995

Published: 27 June 2018

ABSTRACT

The rapid growth of social networks has produced a large volume of user-generated content (UGC), most of which is free and publicly available. As many previous studies have validated, a wide range of users' personal aspects can be extracted from UGC to support personalized applications. Despite its value, UGC can expose users to high privacy risks, a problem that remains largely unexplored. Privacy is defined as an individual's ability to control what information is disclosed, to whom, when, and under what circumstances. Because people and information both play significant roles, privacy has been described as a boundary regulation process in which individuals regulate their interaction with others by adjusting how open they are to them. In this paper, we aim to reduce users' privacy risks on social networks by answering the question of Who Can See What. Towards this goal, we present a novel scheme comprising descriptive, predictive, and prescriptive components. In particular, we first collect a set of posts and extract a group of privacy-oriented features to describe them. We then propose a novel taxonomy-guided multi-task learning model to predict which personal aspects the posts reveal. Lastly, we construct standard guidelines through a user study with 400 users to regularize users' actions and prevent privacy leakage. Extensive experiments on a real-world dataset verify our scheme.
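
The abstract does not give the model's objective, so the following is only a minimal sketch of one common way a taxonomy can guide multi-task learning, not the authors' formulation. It assumes that each personal aspect is a binary task, that each task belongs to exactly one taxonomy category, and that sibling tasks are encouraged to have similar weights through a squared-distance penalty to their group mean. The function names, the toy features, and the grouping are all hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(w, x):
    return sum(a * b for a, b in zip(w, x))

def train_multitask(X, Y, groups, lam=0.1, lr=0.5, epochs=300):
    """Fit one logistic classifier per task (hypothetical personal aspect).

    X: feature vectors, Y[t][i]: 0/1 label of post i for task t,
    groups: lists of task indices sharing a taxonomy category (assumed;
    every task must appear in exactly one group).
    Penalty: lam * sum_t ||w_t - mean of w over t's group||^2.
    """
    n, d, T = len(X), len(X[0]), len(Y)
    W = [[0.0] * d for _ in range(T)]
    for _ in range(epochs):
        new_W = [None] * T
        for g in groups:
            # Mean weight vector of the sibling tasks in this category.
            mu = [sum(W[t][j] for t in g) / len(g) for j in range(d)]
            for t in g:
                grad = [0.0] * d
                for x, y in zip(X, Y[t]):
                    err = sigmoid(dot(W[t], x)) - y
                    for j in range(d):
                        grad[j] += err * x[j] / n
                # Gradient of the penalty with respect to w_t is 2*lam*(w_t - mu).
                new_W[t] = [W[t][j] - lr * (grad[j] + 2 * lam * (W[t][j] - mu[j]))
                            for j in range(d)]
        W = new_W
    return W

def predict(W, x):
    # One boolean per task: does the post reveal that aspect?
    return [sigmoid(dot(w, x)) >= 0.5 for w in W]

def distance(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Toy data (invented): two binary features plus a constant bias term.
X = [[1, 0, 1], [0, 1, 1], [1, 1, 1], [0, 0, 1]]
Y = [[1, 0, 1, 0],   # task 0
     [1, 1, 1, 0],   # task 1, sibling of task 0
     [0, 1, 1, 0]]   # task 2, alone in its category
groups = [[0, 1], [2]]

W_free = train_multitask(X, Y, groups, lam=0.0)
W_tied = train_multitask(X, Y, groups, lam=0.5)
print("sibling distance without penalty:", distance(W_free[0], W_free[1]))
print("sibling distance with penalty:   ", distance(W_tied[0], W_tied[1]))
print("aspects predicted for [0, 1, 1]:", predict(W_free, [0, 1, 1]))
```

Comparing the two printed distances shows the effect of the penalty: with `lam=0.0` the tasks are trained independently, while a positive `lam` pulls the weights of tasks in the same category toward each other. A task that is alone in its category is unaffected by the penalty.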

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
2. Security and privacy
  1. Human and societal aspects of security and privacy
    1. Privacy protections

Published in
SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
June 2018
1509 pages
ISBN: 9781450356572
DOI: 10.1145/3209978
General Chairs:
Kevyn Collins-Thompson, University of Michigan, United States
Qiaozhu Mei, University of Michigan, United States
Program Chairs:
Brian Davison, Lehigh University, United States
Yiqun Liu, Tsinghua University, China
Emine Yilmaz, University College London, United Kingdom
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
boundary regulation
privacy preserving
social media
Qualifiers
- research-article
Acceptance Rates
SIGIR '18 paper acceptance rate: 86 of 409 submissions (21%)
Overall acceptance rate: 792 of 3,983 submissions (20%)