Shuchi Chawla, Cynthia Dwork, Frank McSherry, Adam Smith, and Hoeteck Wee
We initiate a theoretical study of the census problem. Informally, in a census individual respondents give private information to a trusted party (the census bureau), who publishes a sanitized version of the data. There are two fundamentally conflicting requirements: privacy for the respondents and utility of the sanitized data. Unlike in the study of secure function evaluation, in which privacy is preserved to the extent possible given a specific functionality goal, in the census problem privacy is paramount; intuitively, things that cannot be learned “safely” should not be learned at all.
An important contribution of this work is a definition of privacy (and privacy compromise) for statistical databases, together with a method for describing and comparing the privacy offered by specific sanitization techniques. We obtain several privacy results using two different sanitization techniques, and then show how to combine them via cross training. We also obtain two utility results involving clustering.
|Published in||Second Theory of Cryptography Conference (TCC 2005)|
|Series||Lecture Notes in Computer Science|
|Address||Cambridge, MA, USA|
All copyrights reserved by Springer 2007.