Seminar on Jan. 18, 2011

Date: Jan. 18, 2011

Venue: Microsoft Shanghai Zizhu Campus, Building 01, Room 5019

Schedule of talks:

11:00 am

Speaker: Guohua Wu (NTU)

Title: Automatic Sequences and Transcendence of Real Numbers

Abstract:

In this talk, I will first review basics of automatic sequences, and then discuss transendence properties of the corresponding real numbers. Cobham Theorem and Mahler's will also be introduced.

Lunch

2:00 pm

Speaker: David Xiao (Université Paris-Sud)

Title: Is privacy compatible with truthfulness?

Abstract:

We investigate the mainstream interpretation of differential privacy, which says that given a differentially private mechanism, people are likely to share their data truthfully because they are at little risk of revealing their own information. We argue that this interpretation is incomplete because people attach a concrete value to their privacy, and so they should be explicitly motivated to reveal their information. Using the notion of truthfulness from game theory, we show that in certain settings, existing differentially private mechanisms do not encourage participants to report their information truthfully. On the positive side, we exhibit a transformation that, for games where the type space is small and the goal is to optimize social welfare, takes truthful and efficient mechanisms and transform them into differentially private, truthful, and efficient mechanisms.

We also study what happens when an explicit numerical cost is assigned to the information leaked by a mechanism. We show that in this case, even differential privacy may not be strong enough of a notion to motivate people to participate truthfully.

3:00 pm

Speaker: Yuan Zhou (CMU)

Title: Optimal lower bounds for locality sensitive hashing (except when q is tiny)

Abstract:

Lower bounds for Locality Sensitive Hashing (LSH) in the strongest setting: point sets in {0,1}^d under the Hamming distance. Recall that here H is said to be an (r, cr, p, q)-sensitive hash family if all pairs x,y in {0,1}^d with dist(x,y) \leq r have probability at least p of collision under a randomly chosen h in H, whereas all pairs x,y in {0,1}^d with dist(x,y) \geq cr have probability at most q of collision. Typically, one considers d \to infty, with c > 1 fixed and q bounded away from 0.

For its applications to approximate nearest neighbor search in high dimensions, the quality of an LSH family H is governed by how small its "rho parameter" rho = ln(1/p)/ln(1/q) is as a function of the parameter c. The seminal paper of Indyk and Motwani showed that for each c \geq 1, the extremely simple family H = {x \mapsto x_i : i in d} achieves rho \leq 1/c. The only known lower bound, due to Motwani, Naor, and Panigrahy, is that rho must be at least (e^{1/c}-1)/(e^{1/c}+1)\geq .46/c (minus o_d(1)).

In this paper we show an optimal lower bound: rho must be at least 1/c (minus o_d(1)). This lower bound for Hamming space yields a lower bound of 1/c^2 for Euclidean space (or the unit sphere) and 1/c for the Jaccard distance on sets; both of these match known upper bounds. Our proof is simple; the essence is that the noise stability of a boolean function at e^{-t} is a log-convex function of t.

Like the Motwani--Naor--Panigrahy lower bound, our proof relies on the assumption that q is not `tiny', meaning of the form 2^{-Theta(d)}. Some lower bound on q is always necessary, as otherwise it is trivial to achieve rho = 0. The range of q for which our lower bound holds is the same as the range of q for which rho accurately reflects an LSH family's quality. Still, we conclude by discussing why it would be more satisfying to find LSH lower bounds that hold for tiny q.

This is joint work with Ryan O'Donnell of Carnegie Mellon and Yi Wu of IBM Research.