Research related to privacy issues in data analysis.
Overview
The problem of statistical disclosure control (revealing accurate statistics about a population while preserving the privacy of individuals) has a venerable history. An extensive literature spans multiple disciplines: statistics, theoretical computer science, security, and databases. Despite this body of work, «privacy breaches» are common, both in the literature and in practice, even when security and data integrity are not compromised.
This project revisits private data analysis from the perspective of modern cryptography. We address many previous difficulties by obtaining a strong, yet realizable, definition of privacy. Intuitively, differential privacy ensures that the system behaves essentially the same way, independent of whether any individual, or small group of individuals, opts in to or opts out of the database. More precisely, for every possible output of the system, the probability of this output is almost unchanged by the addition or removal of any individual, where the probabilities are taken over the coin flips of the mechanism (and not the data set). Moreover, this holds even in the face of arbitrary existing or future knowledge available to a «privacy adversary,» completely solving the problem of database linkage attacks.
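The «almost unchanged» condition above can be made concrete with a small sketch. It assumes a privacy parameter epsilon (not named in the text above) and uses Laplace noise on a counting query, in the spirit of the «Calibrating Noise to Sensitivity» paper listed below. The database, the function names, and the parameter values are hypothetical and chosen only for illustration. The check at the end confirms that, for every output y on a grid, the log of the ratio between the output densities with and without one individual stays within epsilon.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records satisfying `predicate`.

    Adding or removing one record changes the true count by at most 1,
    so this sketch uses Laplace noise of scale 1/epsilon (an assumption
    of the example, not a statement about any particular system).
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


def worst_log_ratio(count_in, count_out, epsilon, outputs):
    """Largest |log(p_in(y) / p_out(y))| over the candidate outputs.

    p_in and p_out are Laplace densities of scale 1/epsilon centered at
    the true counts with and without one individual.
    """
    scale = 1.0 / epsilon
    return max(abs(abs(y - count_out) - abs(y - count_in)) / scale
               for y in outputs)


def is_senior(age):
    return age >= 65


rng = random.Random(0)
epsilon = 0.5                        # hypothetical privacy parameter
ages = [23, 35, 41, 67, 70]          # hypothetical database
ages_opt_out = ages[:-1]             # same database, one person opts out

print(private_count(ages, is_senior, epsilon, rng))
print(private_count(ages_opt_out, is_senior, epsilon, rng))

# True counts are 2 (opt in) and 1 (opt out); the bound should hold on the grid.
grid = [y / 10.0 for y in range(-200, 201)]
print(worst_log_ratio(2, 1, epsilon, grid) <= epsilon + 1e-12)
```

Smaller values of epsilon make the two output distributions harder to tell apart, at the cost of noisier answers; the randomness comes only from the mechanism's coin flips, not from the data.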
Databases can serve many social goals, such as fair allocation of resources, and identifying genetic markers for disease. Better participation means better information, and the «in vs out» aspect of differential privacy encourages participation.
For a general overview of differential privacy (the problems to be solved, the definition, the formal impossibility results that lead to the definition, general techniques for achieving differential privacy, and some recent directions), see «A Firm Foundation for Private Data Analysis» (Communications of the ACM, January 2011).
For selected publications, organized by topic and in chronological order, scroll down.
A comprehensive list of papers related to the project appears here:
- All publications (navigates away from this page)
- Cynthia Dwork, Differential Privacy, in 33rd International Colloquium on Automata, Languages and Programming, part II (ICALP 2006), Springer Verlag, Venice, Italy, July 2006
- Cynthia Dwork, Differential Privacy: A Survey of Results, in Theory and Applications of Models of Computation—TAMC, Springer Verlag, April 2008
- Cynthia Dwork, The Differential Privacy Frontier, in 6th Theory of Cryptography Conference, TCC 2009, Springer Verlag, San Francisco, CA, March 2009
- Cynthia Dwork, Differential Privacy in New Settings, in Symposium on Discrete Algorithms (SODA), Society for Industrial and Applied Mathematics, January 2010
- Frank McSherry, Privacy Integrated Queries, in Communications of the ACM, Association for Computing Machinery, Inc., 1 September 2010
- Cynthia Dwork, A Firm Foundation for Private Data Analysis, in Communications of the ACM, Association for Computing Machinery, Inc., January 2011
- Cynthia Dwork, The Promise of Differential Privacy: A Tutorial on Algorithmic Techniques, in 52nd Annual IEEE Symposium on Foundations of Computer Science, October 2011
- Ilya Mironov, Omkant Pandey, Omer Reingold, and Salil Vadhan, Computational Differential Privacy, in Advances in Cryptology—CRYPTO 2009, Springer, August 2009
- Cynthia Dwork and Kobbi Nissim, Privacy-Preserving Datamining on Vertically Partitioned Databases, in 24th Annual International Cryptology Conference (CRYPTO 2004), Springer Verlag, Santa Barbara, California, USA, August 2004
- Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith, Calibrating Noise to Sensitivity in Private Data Analysis, in Third Theory of Cryptography Conference (TCC 2006), Springer, New York, NY, USA, March 2006
- Frank McSherry and Kunal Talwar, Mechanism Design via Differential Privacy, in Annual IEEE Symposium on Foundations of Computer Science (FOCS), IEEE, Providence, RI, October 2007
- Cynthia Dwork and Jing Lei, Differential Privacy and Robust Statistics, in Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), Association for Computing Machinery, Inc., Bethesda, Maryland, May 2009
- Cynthia Dwork, Moni Naor, Omer Reingold, Guy Rothblum, and Salil Vadhan, On the Complexity of Differentially Private Data Release: Efficient Algorithms and Hardness Results, in Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), Association for Computing Machinery, Inc., Bethesda, Maryland, May 2009
- Moritz Hardt and Kunal Talwar, On the Geometry of Differential Privacy, in STOC, Association for Computing Machinery, Inc., June 2010
- Avrim Blum, Cynthia Dwork, Frank McSherry, and Kobbi Nissim, Practical Privacy: The SuLQ Framework, in 24th ACM SIGMOD International Conference on Management of Data / Principles of Database Systems (PODS 2005), Baltimore, Maryland, USA, June 2005
- Frank McSherry, Privacy Integrated Queries, in Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD), Association for Computing Machinery, Inc., June 2009
- Lars Backstrom, Cynthia Dwork, and Jon M. Kleinberg, Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography, in International conference on World Wide Web (WWW), ACM, Banff, Alberta, Canada, May 2007
- Cynthia Dwork, Frank McSherry, and Kunal Talwar, The price of privacy and the limits of LP decoding, in Proceedings of the 39th Annual ACM Symposium on Theory of Computing (STOC), Association for Computing Machinery, Inc., San Diego, California, USA, June 2007
- Cynthia Dwork and Sergey Yekhanin, New Efficient Attacks on Statistical Disclosure Control Mechanisms, in Advances in Cryptology—CRYPTO 2008, Springer Verlag, August 2008
- Andrew McGregor, Ilya Mironov, Toniann Pitassi, Omer Reingold, Kunal Talwar, and Salil Vadhan, The Limits of Two-Party Differential Privacy, in 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS 2010), Institute of Electrical and Electronics Engineers, Inc., October 2010
- Shuchi Chawla, Cynthia Dwork, Frank McSherry, and Kunal Talwar, On Privacy-Preserving Histograms, in Uncertainty in Artificial Intelligence (UAI), Association for Uncertainty in Artificial Intelligence, Edinburgh, Scotland, July 2005
- Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor, Our Data, Ourselves: Privacy Via Distributed Noise Generation, in Advances in Cryptology (EUROCRYPT 2006), Springer Verlag, Saint Petersburg, Russia, May 2006
- Philippe Golle, Frank McSherry, and Ilya Mironov, Data Collection With Self-Enforcing Privacy, in ACM Conference on Computer and Communications Security (CCS 2006), ACM, October 2006
- Frank McSherry and Ilya Mironov, Differentially Private Recommender Systems: Building Privacy into the Netflix Prize Contenders, in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), Association for Computing Machinery, Inc., June 2009
- Anupam Gupta, Katrina Ligett, Frank McSherry, Aaron Roth, and Kunal Talwar, Differentially Private Combinatorial Optimization, in SODA, Society for Industrial and Applied Mathematics, January 2010
- Cynthia Dwork, Moni Naor, Toniann Pitassi, Guy N. Rothblum, and Sergey Yekhanin, Pan-Private Streaming Algorithms, in Proceedings of The First Symposium on Innovations in Computer Science (ICS 2010), Tsinghua University Press, January 2010
- Cynthia Dwork, Moni Naor, Toniann Pitassi, and Guy N. Rothblum, Differential Privacy Under Continual Observation, in STOC '10: Proceedings of the 42nd ACM symposium on Theory of computing, Association for Computing Machinery, Inc., June 2010
- Committee on Technical and Privacy Dimensions of Information for Terrorism Prevention and Other National Goals and National Research Council, Protecting Individual Privacy in the Struggle Against Terrorists: A Framework for Program Assessment, National Academies Press, 26 September 2008
Events
- Privacy Workshop, October 10–11, 2011. iDASH, La Jolla, CA
- Statistical and Learning-Theoretic Challenges in Data Privacy, February 22–26, 2010. Institute for Pure and Applied Mathematics (IPAM), Los Angeles, CA
- MindSwap on Privacy Technology, October 19–20, 2007. Center for Computational Thinking, Carnegie Mellon, Pittsburgh, PA
- Workshop on Data Confidentiality, September 6–7, 2007, Arlington, VA
- CS-Statistics Workshop On Privacy and Confidentiality, July 9–15, 2005, Bertinoro, Italy
- DIMACS/PORTIA Workshop on Privacy-Preserving Data Mining, March 15–16, 2004, DIMACS Center, Rutgers University, Piscataway, NJ
In the media
- Thomas Claburn, Counterterrorist Data Mining Needs Privacy Protection, InformationWeek, Oct. 7, 2008
- Samuel Greengard, Privacy Matters, Communications of the ACM, vol. 51(9), Sep. 2008
- Dean Takahashi, Let's not give up on our privacy, San Jose Mercury News, Dec. 21, 2006
- Privacy Integrated Queries (PINQ): a LINQ-like API for computing on privacy-sensitive data sets while providing guarantees of differential privacy for the underlying records. The research project aims to produce a simple yet expressive language in which differential privacy properties can be reasoned about efficiently and in which a rich collection of analyses can be programmed.



