GUPT: Privacy Preserving Data Analysis Made Easy

ABSTRACT
It is often highly valuable for organizations to have their data analyzed by external agents. However, any program that computes on potentially sensitive data risks leaking information through its output. Differential privacy provides a theoretical framework for processing data while protecting the privacy of individual records in a dataset. Unfortunately, it has seen limited adoption because of the loss in output accuracy, the difficulty in making programs differentially private, the lack of mechanisms to describe the privacy budget in a programmer's utilitarian terms, and the challenging requirement that data owners and data analysts manually distribute the limited privacy budget between queries.
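To make the framework concrete: the classic way to achieve differential privacy is the Laplace mechanism of Dwork et al. (TCC 2006), which adds noise calibrated to a query's sensitivity divided by the privacy parameter epsilon. The sketch below is a minimal illustration of that mechanism, not GUPT's implementation; the function names and the clipping-based sensitivity bound are this sketch's own choices.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Draw one sample from Lap(0, scale) by inverse-transform sampling."""
    u = random.random() - 0.5            # uniform on [-0.5, 0.5)
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def private_mean(values, lo, hi, epsilon):
    """Epsilon-differentially private mean of values clipped to [lo, hi].

    Clipping bounds each record's influence on the output, so the
    sensitivity of the mean is (hi - lo) / n; the Laplace mechanism
    then adds noise with scale sensitivity / epsilon.
    """
    n = len(values)
    clipped = [min(max(v, lo), hi) for v in values]
    sensitivity = (hi - lo) / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon)
```

Smaller epsilon means stronger privacy but noisier answers, which is the accuracy loss the abstract refers to.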
This paper presents the design and evaluation of a new system, GUPT, that overcomes these challenges. Unlike existing differentially private systems such as PINQ and Airavat, it guarantees differential privacy to programs not developed with privacy in mind, makes no trust assumptions about the analysis program, and is secure against all known classes of side-channel attacks.
GUPT uses a new model of data sensitivity that degrades the privacy of data over time. This enables efficient allocation of different levels of privacy for different user applications while guaranteeing an overall constant level of privacy and maximizing the utility of each application. GUPT also introduces techniques that improve the accuracy of output while achieving the same level of privacy. These approaches enable GUPT to easily execute a wide variety of data analysis programs while providing both utility and privacy.
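The budget allocation the abstract describes rests on sequential composition: the epsilons of successive queries on the same dataset add up, so a fixed total must be rationed among applications. The sketch below shows only that basic accounting, with hypothetical names; it does not model GUPT's time-based degradation, under which older data becomes less sensitive and effectively frees up budget.

```python
class PrivacyBudget:
    """Minimal privacy-budget ledger (hypothetical API, not GUPT's).

    Sequential composition: if k queries are answered with privacy
    parameters e_1, ..., e_k, the combined release is
    (e_1 + ... + e_k)-differentially private, so a fixed total
    epsilon must be split among them.
    """

    def __init__(self, total_epsilon: float):
        self.remaining = total_epsilon

    def spend(self, epsilon: float) -> bool:
        """Reserve epsilon for one query; refuse if it would overdraw."""
        if epsilon <= 0.0 or epsilon > self.remaining:
            return False
        self.remaining -= epsilon
        return True

budget = PrivacyBudget(1.0)
print(budget.spend(0.4))   # True: roughly 0.6 remains
print(budget.spend(0.7))   # False: would exceed what is left
print(budget.spend(0.6))   # True: budget now essentially exhausted
```

Manually choosing these per-query epsilons is exactly the burden on data owners and analysts that GUPT's allocation mechanism is designed to remove.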
REFERENCES

- N. Anciaux, L. Bouganim, H. van Heerde, P. Pucheral, and P. M. Apers. Data degradation: Making private data less sensitive over time. In CIKM, 2008.
- F. Bancilhon and R. Ramakrishnan. An amateur's introduction to recursive query processing strategies. In SIGMOD, 1986.
- M. Barbaro and T. Zeller. A face is exposed for AOL searcher no. 4417749. The New York Times, Aug. 2006.
- J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. In OSDI, 2004.
- C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, 2006.
- C. Dwork, M. Naor, T. Pitassi, and G. N. Rothblum. Differential privacy under continual observation. In STOC, 2010.
- A. Frank and A. Asuncion. UCI machine learning repository, 2010.
- S. R. Ganta, S. P. Kasiviswanathan, and A. Smith. Composition attacks and auxiliary information in data privacy. In KDD, 2008.
- A. Ghosh and A. Roth. Selling privacy at auction. http://arxiv.org/abs/1011.1375.
- A. Haeberlen, B. C. Pierce, and A. Narayan. Differential privacy under fire. In USENIX Security, 2011.
- M. Hay, V. Rastogi, G. Miklau, and D. Suciu. Boosting the accuracy of differentially private histograms through consistency. Proc. VLDB Endow., 3:1021--1032, September 2010.
- V. Karwa, S. Raskhodnikova, A. Smith, and G. Yaroslavtsev. Private analysis of graph structure. In VLDB, 2011.
- D. Kifer. Attacks on privacy and de Finetti's theorem. In SIGMOD, 2009.
- C. Li, M. Hay, V. Rastogi, G. Miklau, and A. McGregor. Optimizing linear counting queries under differential privacy. In PODS, 2010.
- A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam. l-diversity: Privacy beyond k-anonymity. In ICDE, 2006.
- F. McSherry. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In SIGMOD, 2009.
- F. McSherry and K. Talwar. Mechanism design via differential privacy. In FOCS, 2007.
- A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. In IEEE Symposium on Security and Privacy, 2008.
- K. Nissim, S. Raskhodnikova, and A. Smith. Smooth sensitivity and sampling in private data analysis. In STOC, 2007.
- L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical Report 1999--66, Stanford InfoLab, 1999.
- V. Rastogi and S. Nath. Differentially private aggregation of distributed time-series with transformation and encryption. In SIGMOD, 2010.
- I. Roy, S. T. V. Setty, A. Kilzer, V. Shmatikov, and E. Witchel. Airavat: security and privacy for MapReduce. In NSDI, 2010.
- A. Serjantov and G. Danezis. Towards an information theoretic metric for anonymity. In PET, 2002.
- A. Smith. Privacy-preserving statistical estimation with optimal convergence rates. In STOC, 2011.
- L. Sweeney. k-anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 2002.
- H. van Heerde, M. Fokkinga, and N. Anciaux. A framework to balance privacy and data usability using data degradation. In CSE, 2009.
- C. Wright, C. Cowan, S. Smalley, J. Morris, and G. Kroah-Hartman. Linux Security Modules: General security support for the Linux kernel. In USENIX Security, 2002.
- X. Xiao, G. Bender, M. Hay, and J. Gehrke. iReduct: Differential privacy with reduced relative errors. In SIGMOD, 2011.