Privacy Integrated Queries (PINQ) is a LINQ-like API for computing on privacy-sensitive data sets while providing guarantees of differential privacy for the underlying records. The research project aims to produce a simple yet expressive language whose differential privacy properties can be reasoned about efficiently, and in which a rich collection of analyses can be programmed.
Substantial progress has recently been made in the rigorous treatment of privacy-preserving data analysis, in the form of Differential Privacy: a formal and achievable requirement that a computation not reveal even the presence of any one individual in its input. As powerful as this privacy criterion is, its formal nature challenges data analysts and data providers to design new analyses and verify their privacy properties without the help of differential privacy experts.
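For readers who want the requirement in symbols, the following is the usual textbook statement of differential privacy, not a quotation from the PINQ papers. The names M (a randomized computation), A and B (two data sets that differ in one individual's record), S (any set of possible outputs), and ε (the privacy parameter) are notation introduced here for illustration.

```latex
\Pr[\, M(A) \in S \,] \;\le\; e^{\varepsilon} \cdot \Pr[\, M(B) \in S \,]
\qquad \text{for all such } A, B \text{ and all } S .
```

A smaller ε forces the two output distributions closer together, so any single record has less influence on what an observer sees.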
Privacy Integrated Queries is a programming language and execution platform in which all expressible programs satisfy differential privacy. A data analyst and a data provider can be convinced of the privacy properties of an analysis simply because it is expressed in PINQ. Both the interface PINQ exposes to the analyst and the interface it requires of the source data are simply that of Language Integrated Queries (LINQ), so analysts and providers can get started using PINQ without any complicated infrastructure and without any specialized privacy training.
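To make the idea of a query interface that enforces privacy by construction more concrete, here is a minimal Python sketch. It is not PINQ's API (PINQ is built on LINQ) and not the author's implementation: the names PrivacyBudget, ProtectedData, where, and noisy_count are hypothetical, and the Laplace-noise counting mechanism with a shared epsilon budget is a standard technique assumed here for illustration.

```python
import random


class PrivacyBudget:
    """Hypothetical helper: tracks the epsilon still available for one data set."""

    def __init__(self, total_epsilon):
        self.remaining = total_epsilon

    def spend(self, epsilon):
        # Refuse any query that would exceed the remaining budget.
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise PermissionError("privacy budget exhausted")
        self.remaining -= epsilon


class ProtectedData:
    """Hypothetical wrapper: records are reachable only through noisy aggregates."""

    def __init__(self, records, budget, rng=None):
        self._records = list(records)
        self._budget = budget
        self._rng = rng or random.Random()

    def where(self, predicate):
        # Filtering changes the count by at most one per changed input record,
        # so the filtered view can share the same budget as its source.
        kept = (r for r in self._records if predicate(r))
        return ProtectedData(kept, self._budget, self._rng)

    def noisy_count(self, epsilon):
        # Charge the budget first, then add Laplace noise with scale 1/epsilon.
        # The difference of two exponential draws with rate epsilon is
        # Laplace-distributed with scale 1/epsilon.
        self._budget.spend(epsilon)
        noise = self._rng.expovariate(epsilon) - self._rng.expovariate(epsilon)
        return len(self._records) + noise


if __name__ == "__main__":
    budget = PrivacyBudget(total_epsilon=1.0)
    ages = ProtectedData([23, 35, 41, 52, 67, 19, 30], budget)
    over_30 = ages.where(lambda age: age > 30)
    print("noisy count of ages over 30:", over_30.noisy_count(epsilon=0.5))
    print("budget remaining:", budget.remaining)
```

The analyst never touches the raw records: every result passes through noisy_count, and each call draws down a budget shared with the source data. That is the sense in which an analysis can be trusted from its expression alone, under the assumptions of this sketch.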
Getting and Using PINQ
The PINQ prototype is currently available for download. The distribution contains a functional implementation of the current iteration of the PINQ language, as well as execution middleware that ensures differential privacy against non-malicious users. This implementation is suitable for experimentation and prototyping, but is not intended as industrial strength privacy technology.
The PINQ distribution contains several example applications demonstrating the key differences from LINQ, both functionality removed for privacy reasons and new functionality added for privacy reasons. A PINQ tutorial is available, and is growing as time and suitable examples present themselves. The technical paper describing PINQ also contains many useful discussions of its intended functionality and of the reasoning behind design decisions that may not be obvious.
- Frank McSherry, Privacy Integrated Queries, in Communications of the ACM, Association for Computing Machinery, Inc., 1 September 2010
- Frank McSherry and Ratul Mahajan, Differentially-Private Network Trace Analysis, in Proceedings of SIGCOMM 2010, Association for Computing Machinery, Inc., 30 August 2010
- Frank McSherry, Privacy Integrated Queries, in Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD), Association for Computing Machinery, Inc., June 2009
- Database Privacy
Research related to privacy issues in data analysis.