skip to main content
10.1145/1401890.1401926acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Composition attacks and auxiliary information in data privacy

Published:24 August 2008Publication History

ABSTRACT

Privacy is an increasingly important aspect of data publishing. Reasoning about privacy, however, is fraught with pitfalls. One of the most significant is the auxiliary information (also called external knowledge, background knowledge, or side information) that an adversary gleans from other channels such as the web, public records, or domain knowledge. This paper explores how one can reason about privacy in the face of rich, realistic sources of auxiliary information. Specifically, we investigate the effectiveness of current anonymization schemes in preserving privacy when multiple organizations independently release anonymized data about overlapping populations.

1. We investigate composition attacks, in which an adversary uses independent anonymized releases to breach privacy. We explain why recently proposed models of limited auxiliary information fail to capture composition attacks. Our experiments demonstrate that even a simple instance of a composition attack can breach privacy in practice, for a large class of currently proposed techniques. The class includes k-anonymity and several recent variants.

2. On a more positive note, certain randomization-based notions of privacy (such as differential privacy) provably resist composition attacks and, in fact, the use of arbitrary side information.This resistance enables "stand-alone" design of anonymization schemes, without the need for explicitly keeping track of other releases.

We provide a precise formulation of this property, and prove that an important class of relaxations of differential privacy also satisfy the property. This significantly enlarges the class of protocols known to enable modular design.

References

  1. J. Official Statistics, special issue on disclosure limitation methods, Vol. 14, No. 4, 1998.Google ScholarGoogle Scholar
  2. R. Agrawal and R. Srikant. Privacy-preserving data mining. In SIGMOD, pages 439--450. ACM Press, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. S. Agrawal and J. R. Haritsa. A framework for high-accuracy privacy-preserving mining. In ICDE, pages 193--204. IEEE Computer Society, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, and K. Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In PODS, pages 273--282. ACM Press, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. Barbaro and T. Zeller. A face is exposed for AOL searcher no. 4417749. The New York Times, Aug. 2006.Google ScholarGoogle Scholar
  6. A. Blum, C. Dwork, F. McSherry, and K. Nissim. Practical privacy: The SuLQ framework. In PODS, pages 128--138. ACM Press, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. A. Blum, K. Ligett, and A. Roth. A learning theory approach to non-interactive database privacy. In STOC, pages 609--618. ACM Press, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J.-W. Byun, Y. Sohn, E. Bertino, and N. Li. Secure anonymization for incremental datasets. In Secure Data Management, pages 48--63. Springer, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. K. Chaudhuri and N. Mishra. When random sampling preserves privacy. In CRYPTO, pages 198--213. Springer, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. B.-C. Chen, R. Ramakrishnan, and K. LeFevre. Privacy skyline: Privacy with multidimensional adversarial knowledge. In VLDB, pages 770--781. VLDB Endowment, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. I. Dinur and K. Nissim. Revealing information while preserving privacy. In PODS, pages 202--210. ACM Press, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. Domingo-Ferrer and J. M. Mateo-Sanz. Practical data-oriented microaggregation for statistical disclosure control. IEEE Transactions on Knowledge and Data Engineering, 14(1):189--201, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. C. Dwork. Differential privacy. In ICALP, pages 1--12. Springer, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, pages 486--503. Springer, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, pages 486--503. Springer, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. C. Dwork and K. Nissim. Privacy-preserving datamining on vertically partitioned databases. In CRYPTO, pages 528--544. Springer, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  17. A. Evfimievski, R. Srikant, R. Agrawal, and J. Gehrke. Privacy preserving mining of association rules. In KDD, pages 217--228. ACM Press, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. A. V. Evfimievski, J. Gehrke, and R. Srikant. Limiting privacy breaches in privacy preserving data mining. In PODS, pages 211--222. ACM Press, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. B. C. M. Fung, K. Wang, A. W.-C. Fu, and J. Pei. Anonymity for continuous data publishing. In EDBT, pages 264--275. ACM Press, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. S. R. Ganta, S. P. Kasiviswanathan, and A. Smith. Composition attacks and auxiliary information in data privacy. CoRR, arXiv:0803.0032v1 {cs.DB}, 2008.Google ScholarGoogle Scholar
  21. S. Goldwasser and S. Micali. Probabilistic encryption. Journal of Computer and System Sciences, 28(2):270--299, 1984.Google ScholarGoogle ScholarCross RefCross Ref
  22. S. P. Kasiviswanathan and A. Smith. A note on differential privacy: Defining resistance to arbitrary side information. CoRR, arXiv:0803.39461 {cs.CR}, 2008.Google ScholarGoogle Scholar
  23. K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Mondrian multidimensional k-anonymity. In ICDE, page 25. IEEE Computer Society, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Workload-aware anonymization. In KDD, pages 277--286. ACM Press, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. N. Li, T. Li, and S. Venkatasubramanian. t-closeness: Privacy beyond k-anonymity and l-diversity. In ICDE, pages 106--115. IEEE Computer Society, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  26. Y. Lindell. Composition of Secure Multi-Party Protocols: A Comprehensive Study. Springer-Verlag, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, and L. Vilhuber. Privacy: From theory to practice on the map. In ICDE, pages 277--286. IEEE Computer Society, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. A. Machanavajjhala, D. Kifer, J. Gehrke, and M. Venkitasubramaniam. l-diversity: Privacy beyond k-anonymity. ACM Transactions on Knowledge Discovery from Data, 1(1), 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. D. J. Martin, D. Kifer, A. Machanavajjhala, J. Gehrke, and J. Y. Halpern. Worst-case background knowledge for privacy-preserving data publishing. In ICDE, pages 126--135. IEEE Computer Society, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  30. F. McSherry and K. Talwar. Differential privacy in mechanism design. In FOCS, pages 94--103. IEEE Computer Society, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. K. Nissim, S. Raskhodnikova, and A. Smith. Smooth sensitivity and sampling in private data analysis. In STOC, pages 75--84. ACM Press, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. J. Pei, J. Xu, Z. Wang, W. Wang, and K. Wang. Maintaining k-anonymity against incremental updates. In SSDBM, page 5, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. L. Sweeney. k-anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 10(5):557--570, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. UCI machine learning repository, http://www.ics.uci.edu/ mlearn/databases/.Google ScholarGoogle Scholar
  35. A. van den Hout and P. van der Heijden. Randomized response, statistical disclosure control and misclassification: A review. International Statistical Review, 70:269--288, 2002.Google ScholarGoogle ScholarCross RefCross Ref
  36. K. Wang and B. C. M. Fung. Anonymizing sequential releases. In KDD, pages 414--423. ACM Press, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. S. L. Warner. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60(309):63--69, 1965.Google ScholarGoogle ScholarCross RefCross Ref
  38. R. C.-W. Wong, A. W.-C. Fu, K. Wang, and J. Pei. Minimality attack in privacy preserving data publishing. In VLDB, pages 543--554. VLDB Endowment, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. X. Xiao and Y. Tao. Anatomy: Simple and effective privacy preservation. In VLDB, pages 139--150, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. X. Xiao and Y. Tao. Personalized Privacy Preservation In SIGMOD, pages 229--240, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. X. Xiao and Y. Tao. M-invariance: towards privacy preserving re-publication of dynamic datasets. In SIGMOD, pages 689--700. ACM Press, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. C. Yao, X. S. Wang, and S. Jajodia. Checking for k-anonymity violation by views. In VLDB, p. 910--921, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Composition attacks and auxiliary information in data privacy

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      KDD '08: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2008
      1116 pages
      ISBN:9781605581934
      DOI:10.1145/1401890
      • General Chair:
      • Ying Li,
      • Program Chairs:
      • Bing Liu,
      • Sunita Sarawagi

      Copyright © 2008 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 24 August 2008

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      KDD '08 Paper Acceptance Rate118of593submissions,20%Overall Acceptance Rate1,133of8,635submissions,13%

      Upcoming Conference

      KDD '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader