ABSTRACT
Recommenders are becoming one of the main ways to navigate the Internet. They recommend appropriate items to users based on their clicks, e.g., likes, ratings, and purchases. These clicks are key to providing relevant recommendations and, in this sense, have significant utility. At the same time, because clicks reflect the preferences of users, they also raise privacy concerns. At first glance, there seems to be an inherent trade-off between the utility and privacy effects of a click. A closer look, however, reveals that the situation is more subtle: some clicks do improve utility without compromising privacy, whereas others decrease utility while hampering privacy.
In this paper, for the first time, we propose a way to quantify the exact utility and privacy effects of each user click. More specifically, we show how to compute the privacy effect (disclosure risk) of a click using an information-theoretic approach, as well as its utility, using a commonality-based approach. We determine precisely when utility and privacy are antagonistic and when they are not. To illustrate our metrics, we apply them to recommendation traces from the Movielens and Jester datasets. We show, for instance, that in the Movielens dataset, 5.94% of the clicks improve the recommender's utility without any loss of privacy, whereas 16.43% of the clicks induce a high privacy risk without any utility gain.
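To give intuition for the information-theoretic view of disclosure risk (this is a purely illustrative sketch, not the paper's exact metric), one can measure a click's self-information from item popularity: clicking a rare item reveals more bits about a user than clicking a popular one.

```python
import math

def disclosure_risk(item_popularity: float) -> float:
    """Self-information of a click, in bits.

    item_popularity is the fraction of users who clicked the item;
    rarer items disclose more about the clicking user.
    """
    return -math.log2(item_popularity)

# A blockbuster clicked by half the users discloses 1 bit;
# a niche item clicked by 1 user in 1024 discloses 10 bits.
print(disclosure_risk(0.5))       # 1.0
print(disclosure_risk(1 / 1024))  # 10.0
```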
An appealing application of our metrics is what we call a click-advisor, a visual user-aware clicking platform that helps users decide whether it is actually worth clicking on an item or not (after evaluating its potential utility and privacy effects using our techniques). Using a game-theoretic approach, we evaluate several user clicking strategies. We highlight in particular what we define as a smart strategy, leading to a Nash equilibrium, where every user reaches the maximum possible privacy while preserving the average overall recommender utility for all users (with respect to the case where user clicks are based solely on their genuine preferences, i.e., without consulting the click-advisor).
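As a toy illustration of the click-advisor idea (the thresholds and three-way outcome below are hypothetical placeholders, not the paper's actual decision rule), per-click advice could compare a click's estimated utility gain against its privacy risk:

```python
def advise(utility_gain: float, privacy_risk: float,
           min_gain: float = 0.1, max_risk: float = 5.0) -> str:
    """Toy per-click advice. The thresholds min_gain and max_risk
    are illustrative placeholders, not values from the paper."""
    if utility_gain >= min_gain and privacy_risk <= max_risk:
        return "click"      # utility gain at an acceptable privacy cost
    if utility_gain < min_gain and privacy_risk > max_risk:
        return "avoid"      # high privacy risk with no utility gain
    return "trade-off"      # the user must weigh one against the other

print(advise(0.3, 2.0))   # click
print(advise(0.0, 8.0))   # avoid
print(advise(0.3, 8.0))   # trade-off
```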
Index Terms
- The Utility and Privacy Effects of a Click
Recommendations
The cost of privacy: destruction of data-mining utility in anonymized data publishing
KDD '08: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. Re-identification is a major privacy threat to public datasets containing individual records. Many privacy protection algorithms rely on generalization and suppression of "quasi-identifier" attributes such as ZIP code and birthdate. Their objective is ...
Mining Privacy Settings to Find Optimal Privacy-Utility Tradeoffs for Social Network Services
SOCIALCOM-PASSAT '12: Proceedings of the 2012 ASE/IEEE International Conference on Social Computing and 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust. Privacy has been a big concern for users of social network services (SNS). On recent criticism about privacy protection, most SNS now provide fine privacy controls, allowing users to set visibility levels for almost every profile item. However, this ...
An information theoretic privacy and utility measure for data sanitization mechanisms
CODASPY '12: Proceedings of the second ACM conference on Data and Application Security and Privacy. Data collection agencies publish sensitive data for legitimate purposes, such as research, marketing and etc. Data publishing has attracted much interest in research community due to the important concerns over the protection of individuals privacy. As ...