skip to main content
research-article
Open Access
Artifacts Available
Artifacts Evaluated & Functional

A framework for adaptive differential privacy

Published:29 August 2017Publication History
Skip Abstract Section

Abstract

Differential privacy is a widely studied theory for analyzing sensitive data with a strong privacy guarantee—any change in an individual's data can have only a small statistical effect on the result—and a growing number of programming languages now support differentially private data analysis. A common shortcoming of these languages is poor support for adaptivity. In practice, a data analyst rarely wants to run just one function over a sensitive database, nor even a predetermined sequence of functions with fixed privacy parameters; rather, she wants to engage in an interaction where, at each step, both the choice of the next function and its privacy parameters are informed by the results of prior functions. Existing languages support this scenario using a simple composition theorem, which often gives rather loose bounds on the actual privacy cost of composite functions, substantially reducing how much computation can be performed within a given privacy budget. The theory of differential privacy includes other theorems with much better bounds, but these have not yet been incorporated into programming languages.

We propose a novel framework for adaptive composition that is elegant, practical, and implementable. It consists of a reformulation based on typed functional programming of privacy filters, together with a concrete realization of this framework in the design and implementation of a new language, called Adaptive Fuzz. Adaptive Fuzz transplants the core static type system of Fuzz to the adaptive setting by wrapping the Fuzz typechecker and runtime system in an outer adaptive layer, allowing Fuzz programs to be conveniently constructed and typechecked on the fly. We describe an interpreter for Adaptive Fuzz and report results from two case studies demonstrating its effectiveness for implementing common statistical algorithms over real data sets.

Skip Supplemental Material Section

Supplemental Material

References

  1. Rakesh Agrawal and Ramakrishnan Srikant. 2000. Privacy-preserving Data Mining. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD ’00). ACM, New York, NY, USA, 439–450. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Marc Andrysco, David Kohlbrenner, Keaton Mowery, Ranjit Jhala, Sorin Lerner, and Hovav Shacham. 2015. On Subnormal Floating Point and Abnormal Timing. In Proceedings of the 2015 IEEE Symposium on Security and Privacy (SP ’15). IEEE Computer Society, Washington, DC, USA, 623–639. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Gilles Barthe, Marco Gaboardi, Emilio Jesús Gallego Arias, Justin Hsu, Aaron Roth, and Pierre-Yves Strub. 2015. Higher-order approximate relational refinement types for mechanism design and differential privacy. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, 55–68. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Gilles Barthe, Boris Köpf, Federico Olmedo, and Santiago Zanella Béguelin. 2012. Probabilistic Relational Reasoning for Differential Privacy. In Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’12), Vol. 47. ACM, ACM, New York, NY, USA, 97–110. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Raef Bassily, Adam Smith, and Abhradeep Thakurta. 2014. Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014. IEEE Computer Society, 464–473. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. 2006a. Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology-EUROCRYPT 2006. Springer, 486–503. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Cynthia Dwork and Jing Lei. 2009. Differential privacy and robust statistics. In Proceedings of the forty-first annual ACM symposium on Theory of computing. ACM, 371–380. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006b. Calibrating noise to sensitivity in private data analysis. In Theory of cryptography. Springer, 265–284. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Cynthia Dwork and Aaron Roth. 2014. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science 9, 3-4 (2014), 211–407.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Cynthia Dwork, Guy N Rothblum, and Salil Vadhan. 2010. Boosting and differential privacy. In Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on. IEEE, 51–60.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Hamid Ebadi and David Sands. 2015. Featherweight PINQ. CoRR abs/1505.02642 (2015). http://arxiv .org/abs/1505.02642Google ScholarGoogle Scholar
  12. Alexandre Evfimievski, Ramakrishnan Srikant, Rakesh Agrawal, and Johannes Gehrke. 2002. Privacy Preserving Mining of Association Rules. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’02). ACM, New York, NY, USA, 217–228. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Marco Gaboardi, Andreas Haeberlen, Justin Hsu, Arjun Narayan, and Benjamin C Pierce. 2013. Linear dependent types for differential privacy. In ACM SIGPLAN Notices, Vol. 48. ACM, 357–370.Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Marco Gaboardi, James Honaker, Gary King, Kobbi Nissim, Jonathan Ullman, and Salil Vadhan. Working Paper. PSI (Îĺ): a Private data Sharing Interface. In Theory and Practice of Differential Privacy. New York, NY.Google ScholarGoogle Scholar
  15. Srivatsava Ranjit Ganta, Shiva Prasad Kasiviswanathan, and Adam Smith. 2008. Composition Attacks and Auxiliary Information in Data Privacy. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’08). ACM, New York, NY, USA, 265–273. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Andreas Haeberlen, Benjamin C. Pierce, and Arjun Narayan. 2011. Differential Privacy Under Fire. In Proceedings of the 20th USENIX Conference on Security (SEC’11). USENIX Association, Berkeley, CA, USA, 33–33. http://dl .acm.org/ citation .cfm?id=2028067.2028100Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. J. Hsu, M. Gaboardi, A. Haeberlen, S. Khanna, A. Narayan, B. C. Pierce, and A. Roth. 2014. Differential Privacy: An Economic Method for Choosing Epsilon. In 2014 IEEE 27th Computer Security Foundations Symposium. 398–410. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Aaron Johnson and Vitaly Shmatikov. 2013. Privacy-preserving Data Exploration in Genome-wide Association Studies. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’13). ACM, New York, NY, USA, 1079–1087. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Peter Kairouz, Sewoong Oh, and Pramod Viswanath. 2015. The Composition Theorem for Differential Privacy. In Proceedings of The 32nd International Conference on Machine Learning. 1376–1385.Google ScholarGoogle Scholar
  20. Shiva P Kasiviswanathan and Adam Smith. 2014. On the’Semantics’ of Differential Privacy: A Bayesian Formulation. Journal of Privacy and Confidentiality 6, 1 (2014), 1.Google ScholarGoogle ScholarCross RefCross Ref
  21. Daniel Kifer. 2009. Attacks on Privacy and deFinetti’s Theorem. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD ’09). ACM, New York, NY, USA, 127–138. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Ninghui Li, Tiancheng Li, and Suresh Venkatasubramanian. 2007. t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In 2007 IEEE 23rd International Conference on Data Engineering. 106–115. Google ScholarGoogle ScholarCross RefCross Ref
  23. Ashwin Machanavajjhala, Johannes Gehrke, Daniel Kifer, and Muthuramakrishnan Venkitasubramaniam. 2006. ℓ-Diversity: Privacy Beyond κ-Anonymity. In Proceedings of the 22Nd International Conference on Data Engineering (ICDE ’06). IEEE Computer Society, Washington, DC, USA, 24–. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Frank McSherry and Ilya Mironov. 2009. Differentially Private Recommender Systems: Building Privacy into the Net. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’09). ACM, New York, NY, USA, 627–636. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Frank D McSherry. 2009. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. ACM, 19–30.Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Elinor Mills. 2006. AOL sued over Web search data release. (Sept. 2006). cnet, http://www .cnet.com/news/aol-sued-overweb- search- data- release/ .Google ScholarGoogle Scholar
  27. Ilya Mironov. 2012. On Significance of the Least Significant Bits for Differential Privacy. In Proceedings of the 2012 ACM Conference on Computer and Communications Security (CCS ’12). ACM, New York, NY, USA, 650–661. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Prashanth Mohan, Abhradeep Thakurta, Elaine Shi, Dawn Song, and David Culler. 2012. GUPT: Privacy Preserving Data Analysis Made Easy. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD ’12). ACM, New York, NY, USA, 349–360. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Jack Murtagh and Salil Vadhan. 2016. The Complexity of Computing the Optimal Composition of Differential Privacy. In Theory of Cryptography. Springer, 157–175. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Arjun Narayan and Andreas Haeberlen. 2012. DJoin: Differentially Private Join Queries over Distributed Databases.. In Presented as part of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12). USENIX, Hollywood, CA, 149–162. https://www .usenix.org/conference/osdi12/technical-sessions/presentation/narayanGoogle ScholarGoogle Scholar
  31. Arvind Narayanan and Vitaly Shmatikov. 2008. Robust De-anonymization of Large Sparse Datasets. In Proceedings of the 2008 IEEE Symposium on Security and Privacy. 111–125. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Jason Reed and Benjamin C. Pierce. 2010. Distance Makes the Types Grow Stronger: A Calculus for Differential Privacy. In Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming (ICFP ’10). ACM, New York, NY, USA, 157–168. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Ryan M. Rogers, Aaron Roth, Jonathan Ullman, and Salil P. Vadhan. 2016. Privacy Odometers and Filters: Pay-as-you-Go Composition. In Advances in Neural Information Processing Systems 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.). Curran Associates, Inc., 1921–1929.Google ScholarGoogle Scholar
  34. Indrajit Roy, Srinath T. V. Setty, Ann Kilzer, Vitaly Shmatikov, and Emmett Witchel. 2010. Airavat: Security and Privacy for MapReduce. In Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation (NSDI’10), Vol. 10. USENIX Association, Berkeley, CA, USA, 297–312. http://dl .acm.org/citation.cfm?id=1855711.1855731Google ScholarGoogle Scholar
  35. Ryan Singel. 2009. Netflix Spilled Your “Brokeback Mountain” Secret, Lawsuit Claims. (Dec. 2009). Wired, http:// www .wired.com/2009/12/netflix-privacy-lawsuit/ .Google ScholarGoogle Scholar
  36. Latanya Sweeney. 2002. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10, 5 (Oct. 2002), 557–570. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Ryan J. Tibshirani. 2015. A General Framework for Fast Stagewise Algorithms. The Journal of Machine Learning Research 16, 1 (Jan. 2015), 2543–2588. http://dl .acm.org/citation.cfm?id=2789272.2912080Google ScholarGoogle Scholar
  38. UCI KDD Archive. 2016. 1990 US Census Data Set. (2016). Available from https://kdd .ics.uci.edu/databases/census1990/ USCensus1990raw .html .Google ScholarGoogle Scholar
  39. Philip Wadler. 1990. Linear Types Can Change the World. In IFIP TC 2 Working Conference on Programming Concepts and Methods, Sea of Galilee, Israel. 546–566.Google ScholarGoogle Scholar

Index Terms

  1. A framework for adaptive differential privacy

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader