ABSTRACT
This paper presents a novel technique, anatomy, for publishing sensitive data. Anatomy releases all the quasi-identifier and sensitive values directly in two separate tables. Combined with a grouping mechanism, this approach protects privacy while capturing a large amount of correlation in the microdata. We develop a linear-time algorithm for computing anatomized tables that obey the l-diversity privacy requirement and minimize the error of reconstructing the microdata. Extensive experiments confirm that our technique allows significantly more effective data analysis than the conventional publication method based on generalization. Specifically, anatomy permits aggregate reasoning with average error below 10%, which is orders of magnitude lower than the error obtained from a generalized table.
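The two-table idea can be illustrated with a small Python sketch. It is not the paper's linear-time algorithm: the greedy grouping, the function name `anatomize`, the tuple layouts, and the sample rows are all assumptions made for illustration. Each group is built from tuples with pairwise distinct sensitive values, so within a group of size at least l no sensitive value accounts for more than a 1/l share.

```python
from collections import Counter, defaultdict


def anatomize(rows, l):
    """Illustrative grouping sketch (hypothetical helper, not the paper's code).

    rows: list of (quasi_identifier_tuple, sensitive_value) pairs.
    Returns (qit, st):
      qit: list of (quasi_identifier_tuple, group_id)
      st:  list of (group_id, sensitive_value, count)
    """
    # Bucket the quasi-identifier tuples by their sensitive value.
    buckets = defaultdict(list)
    for qi, s in rows:
        buckets[s].append(qi)

    groups = []  # each group is a list of (qi, sensitive_value)
    # While at least l distinct sensitive values remain, draw one tuple
    # from each of the l currently largest buckets to form a group.
    while sum(1 for b in buckets.values() if b) >= l:
        largest = sorted(
            (s for s in buckets if buckets[s]),
            key=lambda s: -len(buckets[s]),
        )[:l]
        groups.append([(buckets[s].pop(), s) for s in largest])

    # Place each leftover tuple in a group that lacks its sensitive value,
    # so every value still appears at most once per group.
    for s, bucket in buckets.items():
        for qi in bucket:
            target = next(
                (g for g in groups if all(v != s for _, v in g)), None
            )
            if target is None:
                raise ValueError("cannot satisfy the requested diversity")
            target.append((qi, s))

    # Publish two tables linked only by the group id.
    qit = [(qi, gid) for gid, g in enumerate(groups, 1) for qi, _ in g]
    st = [
        (gid, s, c)
        for gid, g in enumerate(groups, 1)
        for s, c in sorted(Counter(v for _, v in g).items())
    ]
    return qit, st


# Made-up sample microdata: (age, sex, zipcode) plus a disease.
rows = [
    ((23, "M", 11000), "pneumonia"),
    ((27, "M", 13000), "dyspepsia"),
    ((35, "M", 59000), "dyspepsia"),
    ((59, "M", 12000), "pneumonia"),
    ((61, "F", 54000), "flu"),
    ((65, "F", 25000), "gastritis"),
    ((65, "F", 25000), "flu"),
    ((70, "F", 30000), "bronchitis"),
]
qit, st = anatomize(rows, 2)
```

In this sketch the quasi-identifier values are published exactly, without generalization. An observer who locates an individual in `qit` learns only the group id, and `st` then lists at least l equally likely sensitive values for that group.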