Database Privacy

Research related to privacy issues in data analysis.

Overview

Statistical databases, such as those produced by the US Census, contain a large volume of illuminating and potentially useful data. They also risk revealing a great deal of specific information about individual participants, something participants generally object to.

Additionally, each individual appears in myriad databases around the world, from purchases at online booksellers to medical records at hospitals to records on file with the government. Each such database may reveal little about a person on its own, though when combined they may be quite incriminating.

There is an inherent tradeoff between the utility that databases can offer and the privacy they afford their constituents. We are studying this tradeoff formally, attempting to understand the relationship between privacy and utility, and thereby find a comfortable position between the extremes of fully disclosed and completely withheld data.

  • What is the right formal characterization of privacy in a public database?
  • What sanitization measures should be employed to preserve this privacy?
  • Which data analyses can be performed on this sanitized data?

Policy and Position Papers

Surveys and Invited Talks

Conference Papers

Events

In the media

Related Project: PINQ
  • Privacy Integrated Queries (PINQ)
    Privacy Integrated Queries is a LINQ-like API for computing on privacy-sensitive data sets while providing guarantees of differential privacy for the underlying records. The research project aims to produce a simple yet expressive language about which differential privacy properties can be efficiently reasoned, and in which a rich collection of analyses can be programmed.
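
To make the idea of a differentially private query concrete, here is a minimal Python sketch. It is not PINQ's API. The class `PrivateCounter`, its `noisy_count` method, and the `remaining_epsilon` attribute are hypothetical names chosen for illustration. The sketch assumes the standard Laplace mechanism: a count changes by at most 1 when a single record is added or removed, so adding Laplace noise of scale 1/epsilon gives epsilon-differential privacy for that query. It also assumes simple sequential composition, where the epsilons of successive queries add up against a fixed budget.

```python
import random


class PrivateCounter:
    """Hypothetical wrapper (not PINQ's API) answering count queries under a privacy budget."""

    def __init__(self, records, total_epsilon, seed=None):
        self._records = list(records)
        self.remaining_epsilon = total_epsilon  # budget left for future queries
        self._rng = random.Random(seed)

    def noisy_count(self, predicate, epsilon):
        """Return the number of records matching predicate, plus Laplace noise."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining_epsilon:
            raise RuntimeError("privacy budget exhausted")
        # Sequential composition: each query spends its epsilon from the budget.
        self.remaining_epsilon -= epsilon
        true_count = sum(1 for record in self._records if predicate(record))
        # The difference of two independent exponentials with rate epsilon is
        # Laplace-distributed with mean 0 and scale 1/epsilon.
        noise = self._rng.expovariate(epsilon) - self._rng.expovariate(epsilon)
        return true_count + noise


if __name__ == "__main__":
    ages = [23, 35, 41, 52, 67, 29, 38]  # made-up illustrative data
    db = PrivateCounter(ages, total_epsilon=1.0)
    print(db.noisy_count(lambda age: age >= 40, epsilon=0.5))
    print("budget left:", db.remaining_epsilon)
```

A smaller epsilon means more noise and stronger privacy, which is the utility and privacy tradeoff described in the overview. Refusing queries once the budget is spent is one simple way to keep the total privacy loss bounded.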