Bay Area Theory Symposium: BATS 2007

Microsoft Research, Silicon Valley Campus
November 30, 2007


The Bay Area Theory Symposium (BATS) is an annual one-day event organized by the theory research groups at UC Berkeley, Stanford, IBM Almaden, and Microsoft Research Silicon Valley Campus. All interested parties are invited!

Location and Date

Location: Microsoft Silicon Valley Campus, Building 1, Galileo Auditorium. Parking is free. Please consult the driving directions here.

Date: Friday, November 30, 2007, 10:30 am -- 6:00 pm.

Registration

BATS is free to all who are interested. However, if you plan to come, please email us at the following address (write your name and affiliation and put "BATS registration" as the subject):

bats07@microsoft.com

Please register by Nov. 25 if possible, so that we can notify reception and plan for food and beverages. (Those registering afterwards should still attend.)


Program


10:30 - 11:00
Coffee and pastries
11:00 - 12:00
Salil Vadhan, Harvard University and UC Berkeley
The complexity of zero knowledge
12:00 - 12:30
Michael W. Mahoney, Yahoo!
Statistical leverage and improved matrix algorithms
12:30 - 2:00
Lunch (provided)
2:00 - 2:30
Udi Wieder, Microsoft
On time-space tradeoffs for approximate nearest neighbor search algorithms
2:30 - 3:30
Dan Boneh, Stanford University
New tools in public-key cryptography
3:30 - 4:00
Henry Lin, UC Berkeley
Decomposing networks, Internet routing, and Polya urns with the power of choice
4:00 - 4:30
Coffee break
4:30 - 5:00
Ying Xu, Stanford University
Estimating sum by weighted sampling
5:00 - 6:00
Ken Clarkson, IBM Almaden
Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm


Previous meetings

Please see the BATS permanent webpage for meetings since the 2001 revival of BATS.

Talk Abstracts

The complexity of zero knowledge
Salil Vadhan

Zero-knowledge proofs are interactive protocols whereby one party, the prover, can convince another, the verifier, that some assertion is true with the remarkable property that the verifier learns nothing other than the fact that the assertion being proven is true. In the quarter-century since they were introduced, zero-knowledge proofs have provided one of the most fertile grounds for interaction between cryptography and complexity theory. On one hand, zero-knowledge proofs and their cryptographic applications naturally raise interesting complexity issues, such as characterizing the statements that can be proven in zero knowledge and identifying the minimal complexity assumptions needed to do so. On the other hand, the study of proofs is intimately related to the central questions of complexity theory, and zero knowledge enriches this study by incorporating such intriguing concepts as interaction, randomness, knowledge, and secrecy.

In this talk, we survey some of the complexity-theoretic study of zero-knowledge proofs, including our recent works (with Shien Jin Ong and others) providing exact characterizations and unconditional results about each of the main flavors of zero-knowledge proofs.

Statistical leverage and improved matrix algorithms
Michael W. Mahoney

Given an m x n matrix A and a rank parameter k, define the leverage of the i-th row of A to be the i-th diagonal element of the projection matrix onto the span of the top k left singular vectors of A. Historically, this statistical concept (and generalizations of it) has found extensive applications in, e.g, diagnostic regression analysis. Very recently, this concept has been central in the development of improved algorithms for several fundamental matrix problems. Two examples of this will be described. The first problem is the least squares approximation problem, in which there are n constraints and d variables. Classical algorithms, dating back to Gauss and Legendre, use O(nd^2) time. We describe a randomized algorithm that uses only O(n d log d) time to compute a relative-error, i.e., 1+/-epsilon, approximation. The second problem is the problem of selecting a ``good'' set of exactly k columns from an m x n matrix, and the algorithm of Gu and Eisenstat provides the best previously existing result. We describe a two-stage algorithm that improves on their result, assuming that k is small. Applications of these ideas in Internet data analysis will be briefly described.

New tools in public-key cryptography
Dan Boneh

The talk will survey a new approach to constructing space-efficient cryptosystems based on quadratic forms. These techniques give a new non-interactive key exchange protocol, an oblivious transfer protocol, a new universal hash proof, and an identity-based encryption without pairings.

This is joint work with Craig Gentry and Mike Hamburg.

Decomposing networks, Internet routing, and Polya urns with the power of choice
Henry Lin

The preferential attachment model is a random graph model, which has been used in various contexts to model the Internet.  In this talk, we show that with high probability a graph generated by the preferential attachment model can be decomposed into about sqrt(n) subgraphs of size about sqrt(n) and diameter O(log n), such that the subgraphs cover all nodes and each pair of subgraphs intersect.  Our decomposition enables a very simple compact routing scheme to be defined for preferential attachment graphs.  The main technical challenge in proving the correctness of the routing scheme is to analyze the balancing properties of a random balls and bins process.  In each step of this process, two or more bins are selected with probability proportional to load, and a ball is thrown into the least loaded of the two or more bins selected.  We show that "the power of choice" also balances load distributions in this random balls and bins model.

Joint work with Christos Amanatidis, Richard Karp, Christos Papadimitriou, Martha Sideri.

Estimating sum by weighted sampling
Ying Xu

We study the classic problem of estimating the sum of n variables. The traditional uniform sampling approach requires a linear number of samples to provide any non-trivial guarantee on the estimated sum. In this paper we consider alternative sampling methods; in particular, motivated by the application of estimating web size, we consider weighted sampling where a variable is samped with probability proportional to its value. We develop an algorithm for estimating sum with O(n^{1/2}) samples using weighted sampling, and an algorithm with O(n^{1/3}) samples using both uniform sampling and weighted sampling. More generally, we may allow general weighted sampling where the probability of sampling a variable is proportional to an arbitrary function of its value. We prove a lower bound of Omega(n^{1/3}) samples for general weighted sampling, which implies that our sum estimator is almost optimal.

Joint work with Rajeev Motwani and Rina Panigrahy

Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm
Ken Clarkson

The problem of maximizing a concave function f(x) in a simplex S can be solved approximately by a simple greedy algorithm, that in k steps can find a point x_k such that f(x_k) ≥ f(x_*) - O(1/k), where f(x_*) is the maximum value of f in S. This algorithm and analysis were known before, and related to problems of statistics and machine learning, such as boosting, training support vector machines, regression, and density mixture estimation. In other work, coming from computational geometry, the existence of ε-coresets was shown for the minimum enclosing ball problem, by means of a simple greedy algorithm. Similar greedy algorithms, that are special cases of the Frank-Wolfe algorithm, were described for other enclosure problems. I'll tie these results together, review some  stronger convergence results, and generalize or strengthen some coreset bounds.