skip to main content
10.1145/1247480.1247490acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Approximate algorithms for K-anonymity

Published:11 June 2007Publication History

ABSTRACT

When a table containing individual data is published, disclosure of sensitive information should be prohibitive. A naive approach for the problem is to remove identifiers such as name and social security number. However, linking attacks which joins the published table with other tables on some attributes, called quasi-identifier, may reveal the sensitive information. To protect privacy against linking attack, the notion of k-anonymity which makes each record in the table be indistinguishable with k-1 other records has been proposed previously. It is shown to be NP-Hard to k-anonymize a table minimizing the number of suppressed cells. To alleviate this, O(k log k)-approximation and O(k)-approximation algorithms were proposed in previous works.

In this paper, we propose several approximation algorithms that guarantee O(log k)-approximation ratio and perform significantly better than the traditional algorithms. We also provide O(ß log k)-approximate algorithms which gracefully adjust their running time according to the tolerance é (≥ 1) of the approximation ratios. Experimental results confirm that our approximation algorithms perform significantly better than traditional approximation algorithms.

References

  1. C. C. Aggarwal. On k-anonymity and the curse of dimensionality. In VLDB, pages 901--909, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. G. Aggarwal, T. Feder, K. Kenthapadi, S. Khuller, R. Panigrahy, D. Thomas, and A. Zhu. Achieving anonymity via clustering. In VLDB, 2006.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. G. Aggarwal, T. Feder, K. Kenthapadi, R. Motwani, R. Panigrahy, D. Thomas, and A. Zhu. Anonymizing tables. In ICDT, pages 246--258, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In VLDB, pages 487--499, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. R. Bayardo and R. Agrawal. Data privacy through optimal k-anonymization. In ICDE, pages 217--228, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms (Second Edition). McGraw Hill and MIT Press, 2001.Google ScholarGoogle Scholar
  7. D. S. Johnson. Approximation algorithms for combinatorial problems. Journal of Computer and System Sciences, 9:256--278, 1974.Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In SIGMOD Conference, pages 1--12, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Incognito: Efficient full-domain k-anonymity. In SIGMOD, pages 49--60, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Mondrian multidimensional k-anonymity. In ICDE, page 25, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. A. Machanavajjhala, J. Gehrke, and D. Kifer. l-diversity: Privacy beyond k-anonymity. In ICDE, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. Meyerson and R. Williams. On the complexity of optimal k-anonymity. In Proc. of PODS, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for association rules. In ICDT, pages 398--416, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. U.C. Irvine Machine Learning Repository. http://www.ics.uci.edu/~mlearn/mlsummary.html.Google ScholarGoogle Scholar
  15. P. Samarati and L. Sweeny. Generalizing data to provide anonymity when disclosing information (abstract). In In Proc. of ACM Symposium on Principles of Database Systems, page 188, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. L. Sweeny. k-anonymity: a model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowedge-based Systems, 10(5):557--570, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. L. Willenborg and T. deWaal. Elements of statistical disclosure control. Springer Verlog Lecture Notes in Statistics, 2000.Google ScholarGoogle Scholar

Index Terms

  1. Approximate algorithms for K-anonymity

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
          June 2007
          1210 pages
          ISBN:9781595936868
          DOI:10.1145/1247480
          • General Chairs:
          • Lizhu Zhou,
          • Tok Wang Ling,
          • Program Chair:
          • Beng Chin Ooi

          Copyright © 2007 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 11 June 2007

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • Article

          Acceptance Rates

          Overall Acceptance Rate785of4,003submissions,20%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader