Article

Approximate algorithms for K-anonymity

Authors:
Hyoungmin Park

Seoul National University, Seoul, South Korea

Seoul National University, Seoul, South Korea
View Profile

,
Kyuseok Shim

Seoul National University, Seoul, South Korea

Seoul National University, Seoul, South Korea
View Profile

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of dataJune 2007Pages 67–78https://doi.org/10.1145/1247480.1247490

Published:11 June 2007Publication History

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data

Pages 67–78

ABSTRACT

When a table containing individual data is published, disclosure of sensitive information should be prohibitive. A naive approach for the problem is to remove identifiers such as name and social security number. However, linking attacks which joins the published table with other tables on some attributes, called quasi-identifier, may reveal the sensitive information. To protect privacy against linking attack, the notion of k-anonymity which makes each record in the table be indistinguishable with k-1 other records has been proposed previously. It is shown to be NP-Hard to k-anonymize a table minimizing the number of suppressed cells. To alleviate this, O(k log k)-approximation and O(k)-approximation algorithms were proposed in previous works.

In this paper, we propose several approximation algorithms that guarantee O(log k)-approximation ratio and perform significantly better than the traditional algorithms. We also provide O(ß log k)-approximate algorithms which gracefully adjust their running time according to the tolerance é (≥ 1) of the approximation ratios. Experimental results confirm that our approximation algorithms perform significantly better than traditional approximation algorithms.

References

C. C. Aggarwal. On k-anonymity and the curse of dimensionality. In VLDB, pages 901--909, 2005. Google ScholarDigital Library
G. Aggarwal, T. Feder, K. Kenthapadi, S. Khuller, R. Panigrahy, D. Thomas, and A. Zhu. Achieving anonymity via clustering. In VLDB, 2006.Google ScholarDigital Library
G. Aggarwal, T. Feder, K. Kenthapadi, R. Motwani, R. Panigrahy, D. Thomas, and A. Zhu. Anonymizing tables. In ICDT, pages 246--258, 2005. Google ScholarDigital Library
R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In VLDB, pages 487--499, 1994. Google ScholarDigital Library
R. Bayardo and R. Agrawal. Data privacy through optimal k-anonymization. In ICDE, pages 217--228, 2005. Google ScholarDigital Library
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms (Second Edition). McGraw Hill and MIT Press, 2001.Google Scholar
D. S. Johnson. Approximation algorithms for combinatorial problems. Journal of Computer and System Sciences, 9:256--278, 1974.Google ScholarDigital Library
J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In SIGMOD Conference, pages 1--12, 2000. Google ScholarDigital Library
K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Incognito: Efficient full-domain k-anonymity. In SIGMOD, pages 49--60, 2005. Google ScholarDigital Library
K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Mondrian multidimensional k-anonymity. In ICDE, page 25, 2006. Google ScholarDigital Library
A. Machanavajjhala, J. Gehrke, and D. Kifer. l-diversity: Privacy beyond k-anonymity. In ICDE, 2006. Google ScholarDigital Library
A. Meyerson and R. Williams. On the complexity of optimal k-anonymity. In Proc. of PODS, 2004. Google ScholarDigital Library
N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for association rules. In ICDT, pages 398--416, 1999. Google ScholarDigital Library
U.C. Irvine Machine Learning Repository. http://www.ics.uci.edu/~mlearn/mlsummary.html.Google Scholar
P. Samarati and L. Sweeny. Generalizing data to provide anonymity when disclosing information (abstract). In In Proc. of ACM Symposium on Principles of Database Systems, page 188, 1998. Google ScholarDigital Library
L. Sweeny. k-anonymity: a model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowedge-based Systems, 10(5):557--570, 2002. Google ScholarDigital Library
L. Willenborg and T. deWaal. Elements of statistical disclosure control. Springer Verlog Lecture Notes in Statistics, 2000.Google Scholar

Index Terms

Approximate algorithms for K-anonymity

Recommendations

(α, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining

Privacy preservation is an important issue in the release of data for mining purposes. The k-anonymity model has been introduced for protecting individual identification. Recent studies show that a more sophisticated model is necessary to protect the ...
Read More
Approximate algorithms with generalizing attribute values for k-anonymity

When a table containing individual data is published, disclosure of sensitive information should be prohibitive. Since simply removing identifiers such as name and social security number may reveal the sensitive information by linking attacks which ...
Read More
(α, k)-anonymous data publishing

Privacy preservation is an important issue in the release of data for mining purposes. The k-anonymity model has been introduced for protecting individual identification. Recent studies show that a more sophisticated model is necessary to protect the ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
June 2007
1210 pages
ISBN:9781595936868
DOI:10.1145/1247480
General Chairs:
Lizhu Zhou
Tsinghua University, China
,
Tok Wang Ling
National University of Singapore, Singapore
,
Program Chair:
Beng Chin Ooi
National University of Singapore, Singapore
Copyright © 2007 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 11 June 2007
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
anonymity
data mining
data publishing
local recoding
privacy preservation
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate785of4,003submissions,20%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 103
  Total Citations
  View Citations
- 182
  Total Downloads
- Downloads (Last 12 months)39
- Downloads (Last 6 weeks)10
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Approximate algorithms for K-anonymity

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data

ABSTRACT

References

Cited By

Index Terms

Recommendations

(α, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing

Approximate algorithms with generalizing attribute values for k-anonymity

(α, k)-anonymous data publishing

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Approximate algorithms for K-anonymity

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data

ABSTRACT

References

Cited By

Index Terms

Recommendations

(α, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing

Approximate algorithms with generalizing attribute values for k-anonymity

(α, k)-anonymous data publishing

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media