A framework for adaptive differential privacy

Authors:
Daniel Winograd-Cort, University of Pennsylvania, USA
Andreas Haeberlen, University of Pennsylvania, USA
Aaron Roth, University of Pennsylvania, USA
Benjamin C. Pierce, University of Pennsylvania, USA
Proceedings of the ACM on Programming Languages, Volume 1, Issue ICFP, Article No. 10, pp. 1–29. https://doi.org/10.1145/3110254

Published: 29 August 2017

Abstract

Differential privacy is a widely studied theory for analyzing sensitive data with a strong privacy guarantee—any change in an individual's data can have only a small statistical effect on the result—and a growing number of programming languages now support differentially private data analysis. A common shortcoming of these languages is poor support for adaptivity. In practice, a data analyst rarely wants to run just one function over a sensitive database, nor even a predetermined sequence of functions with fixed privacy parameters; rather, she wants to engage in an interaction where, at each step, both the choice of the next function and its privacy parameters are informed by the results of prior functions. Existing languages support this scenario using a simple composition theorem, which often gives rather loose bounds on the actual privacy cost of composite functions, substantially reducing how much computation can be performed within a given privacy budget. The theory of differential privacy includes other theorems with much better bounds, but these have not yet been incorporated into programming languages.
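To make the gap between composition theorems concrete, the following sketch compares the simple composition bound with the advanced composition bound of Dwork, Rothblum, and Vadhan (2010) for k mechanisms that are each (ε, δ)-differentially private. The function names and example parameters are illustrative assumptions; they are not taken from the paper or from any of the languages it discusses.

```python
import math


def simple_composition(epsilon, delta, k):
    """Simple composition: the privacy parameters of k mechanisms add up."""
    return k * epsilon, k * delta


def advanced_composition(epsilon, delta, k, delta_prime):
    """Advanced composition bound (Dwork, Rothblum, Vadhan 2010).

    Composing k mechanisms that are each (epsilon, delta)-DP is
    (eps_total, k * delta + delta_prime)-DP, where delta_prime in (0, 1)
    is an extra failure probability chosen by the analyst.
    """
    if not 0.0 < delta_prime < 1.0:
        raise ValueError("delta_prime must lie strictly between 0 and 1")
    eps_total = (
        math.sqrt(2.0 * k * math.log(1.0 / delta_prime)) * epsilon
        + k * epsilon * (math.exp(epsilon) - 1.0)
    )
    return eps_total, k * delta + delta_prime


if __name__ == "__main__":
    # Hypothetical workload: 100 queries, each run at epsilon = 0.1.
    print(simple_composition(0.1, 0.0, 100))
    print(advanced_composition(0.1, 0.0, 100, 1e-6))
```

With these example values the simple bound gives a total ε of 10, while the advanced bound gives roughly 6.3 at the cost of an additional δ' of 10^-6. For a single query the simple bound is the tighter of the two, so the advantage only appears when many small-ε queries are composed. Note that this form of the advanced bound assumes the per-query parameters are fixed in advance, which is exactly the restriction that adaptive analysis needs to lift.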

We propose a novel framework for adaptive composition that is elegant, practical, and implementable. It consists of a reformulation of privacy filters based on typed functional programming, together with a concrete realization of this framework in the design and implementation of a new language, called Adaptive Fuzz. Adaptive Fuzz transplants the core static type system of Fuzz to the adaptive setting by wrapping the Fuzz typechecker and runtime system in an outer adaptive layer, allowing Fuzz programs to be conveniently constructed and typechecked on the fly. We describe an interpreter for Adaptive Fuzz and report results from two case studies demonstrating its effectiveness for implementing common statistical algorithms over real data sets.
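The idea of a privacy filter can be sketched in a few lines: before each query, the analyst proposes privacy parameters that may depend on earlier results, and the filter decides whether the query may run within a global budget. The sketch below uses the simplest possible rule, summing the parameters as in simple composition. The class and method names are hypothetical; this is not the Adaptive Fuzz API, and the filters studied in the paper are based on tighter bounds.

```python
from dataclasses import dataclass


@dataclass
class BasicPrivacyFilter:
    """Hypothetical filter that tracks spending under simple composition."""

    epsilon_budget: float
    delta_budget: float = 0.0
    epsilon_spent: float = 0.0
    delta_spent: float = 0.0

    def try_charge(self, epsilon: float, delta: float = 0.0) -> bool:
        """Return True and record the cost if the query fits the budget.

        Return False and leave the recorded spending unchanged otherwise,
        in which case the caller must not run the query.
        """
        if epsilon < 0.0 or delta < 0.0:
            raise ValueError("privacy parameters must be non-negative")
        new_epsilon = self.epsilon_spent + epsilon
        new_delta = self.delta_spent + delta
        if new_epsilon > self.epsilon_budget or new_delta > self.delta_budget:
            return False
        self.epsilon_spent = new_epsilon
        self.delta_spent = new_delta
        return True


if __name__ == "__main__":
    privacy_filter = BasicPrivacyFilter(epsilon_budget=1.0)
    # The analyst may pick each epsilon after seeing earlier answers.
    for proposed_epsilon in (0.5, 0.25, 0.5, 0.25):
        allowed = privacy_filter.try_charge(proposed_epsilon)
        print(proposed_epsilon, "allowed" if allowed else "refused")
```

Because a refused query leaves the recorded spending untouched, the analyst can retry with a smaller ε instead of ending the session. Swapping the summation rule for a tighter bound would change only the body of `try_charge`, not the way callers interact with the filter.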

Supplemental Material

icfp17-main109-s.zip (164.6 KB)

Contained here are the source files for the Adaptive Fuzz language along with some simple libraries, examples, and tests. Adaptive Fuzz requires OCaml and is currently only supported on Ubuntu 16.04.2 LTS (but may work in other environments). After installing the language, the user will be able to run tests and examples, including the case studies described in the paper.

Index Terms

A framework for adaptive differential privacy
1. Software and its engineering
  1. Software notations and tools
    1. Formal language definitions
      1. Semantics
2. Theory of computation
  1. Semantics and reasoning
    1. Program semantics
      1. Operational semantics
  2. Theory and algorithms for application domains
    1. Database theory
      1. Theory of database privacy and security

Published in

Proceedings of the ACM on Programming Languages Volume 1, Issue ICFP
September 2017
1173 pages
EISSN:2475-1421
DOI:10.1145/3136534

Copyright © 2017 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Artifacts Available
- Artifacts Evaluated & Functional
Author Tags
Adaptivity
Case Study
Differential Privacy
Fuzz
Privacy Filter
Qualifiers
- research-article