Database Privacy

Overview

Statistical databases, such as those produced by the US Census, contain a large volume of illuminating and potentially useful data. They also risk revealing a great deal of specific information about the participants, which participants generally dislike.

Additionally, each individual appears in a myriad of databases around the world, from purchase records at online booksellers to medical records at hospitals to records on file with the government. Individually, such databases may reveal little about a person, but when combined they may be quite incriminating.

There is an inherent tradeoff between the utility that databases can offer and the privacy they afford their constituents. We are studying this tradeoff formally, attempting to understand the relationship between privacy and utility, and thereby to find a comfortable position between the extremes of fully disclosed and completely withheld data.

  • What is the right formal characterization of privacy in a public database?
  • What sanitization measures should be employed to preserve this privacy?
  • Which data analyses can be performed on this sanitized data?

We are exploring two different computational models for statistical databases. In the "census" model, the data are sanitized and the results are published; the adversary has arbitrary access to the published data. In the "output perturbation" model, the adversary may make a limited number of queries to the database. In response to each query, the true answer is computed and then perturbed by adding random noise; only the perturbed value is released. So far, the latter model has proved more tractable, because the adversary's access to the data is limited. In this model, provided the adversary is restricted to a number of queries that is sublinear in the number of database rows (a reasonable assumption if the database is very large), privacy can be achieved at virtually no loss in statistical accuracy. The census model, on the other hand, seems to capture more profound questions.
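The output perturbation model described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the project: the class name PerturbedDatabase, the use of counting queries, the Gaussian noise, the noise scale, and the choice of a square-root query budget as one example of a sublinear limit. The noise is not calibrated to any formal privacy guarantee.

```python
import math
import random


class PerturbedDatabase:
    """Hypothetical wrapper: answers counting queries with additive noise
    and refuses to answer once a fixed query budget is used up."""

    def __init__(self, rows, max_queries, noise_scale, seed=None):
        self._rows = list(rows)
        self._remaining = max_queries      # queries the adversary may still make
        self._noise_scale = noise_scale    # standard deviation of the noise (assumed)
        self._rng = random.Random(seed)

    def count(self, predicate):
        """Return the number of rows satisfying `predicate`, plus random noise."""
        if self._remaining <= 0:
            raise RuntimeError("query budget exhausted")
        self._remaining -= 1
        true_answer = sum(1 for row in self._rows if predicate(row))
        # Only the perturbed value leaves this method; the true answer does not.
        noise = self._rng.gauss(0.0, self._noise_scale)
        return true_answer + noise


if __name__ == "__main__":
    n = 10_000
    data_rng = random.Random(1)
    rows = [data_rng.randint(0, 1) for _ in range(n)]  # one private bit per row

    # Assumed budget: sqrt(n) queries, which is sublinear in the number of rows.
    db = PerturbedDatabase(rows, max_queries=math.isqrt(n), noise_scale=10.0, seed=2)
    noisy = db.count(lambda bit: bit == 1)
    print(f"noisy count of ones: {noisy:.1f} (out of {n} rows)")
```

In this sketch the noise has a standard deviation of 10 against counts in the thousands, which is the intuition behind "virtually no loss in statistical accuracy" for large databases; how much noise is actually required for privacy is the subject of the formal analysis and is not settled by this example.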

Project Members

Project Visitors

  • Boaz Barak
  • Avrim Blum
  • Kamalika Chaudhuri
  • Shuchi Chawla
  • Petros Drineas
  • Satyen Kale
  • Krishnaram Kenthapadi
  • Moni Naor
  • Kobbi Nissim
  • Adam Smith
  • Madhu Sudan
  • Hoeteck Wee

©2008 Microsoft Corporation. All rights reserved.