Share this page
  • Share this page on Twitter Share this page on Facebook Share this page on Digg Share this page on Del.icio.us Read the Inside Microsoft Research blog
  • E-mail this page Print this page
  • RSS feeds
Home > Events > India Theory Day 2012
India Theory Day 2012

India Theory Day is a one-day workshop series on Theoretical Computer Science. This workshop series, jointly organized by Microsoft Research India and the Indian Institute of Science (IISc), Bangalore, consists of a day of research talks by eminent researchers in Theoretical Computer Science. The talks are intended to expose the audience to recent research advances and inspire young researchers, as well as stimulate new directions in research in this field.

MSR India Theory Day 2012 was held at the MRC auditorium in the Indian Institute of Science, Bangalore, on 05 January 2012. The day long agenda included talks by eminent speakers like Eva Tardos, Michael Saks, S.Muthukrishnan, Amit Kumar, Nikhil Bansal and Nisheeth Vishnoi. The event attracted over 130 attendees comprising Graduate and PhD students and faculty  from various institutes from across the country. Over 50 pecent of the attendees were from outside Bangalore.

TALK ABSTRACTS:

Eva Tardos:

Title: Welfare and Revenue in Ad Auctions

Abstract: Auctions for selling advertisement space has been the main mechanism used to monetize Internet services. These auctions give rise to a number of interesting challenges not traditionally considered by auction theory. Search advertising gives rise of an enormous number of auctions running simultaneously, necessitating the use of extremely simple mechanisms. Traditional auction theory tells us how to design optimal auction (maximizing welfare or revenue), but typically results in designs that are too complex for the web environment. The most commonly used auction format is Generalized Second Price, that is indeed simple and natural, but is not an optimal design.

Over the last 10+ years we have developed a good understanding of many games naturally arising in the context of Internet or web services from the perspective of the resulting social welfare, including a good understanding of games modeling selfish traffic routing, service location, bandwidth sharing among others. In this talk we will consider the Generalized Second Price auction from this perspective, and consider the social welfare and revenue of the equilibria of this game in various setting.

 ============================================================

Michael Saks:

Title: Two very efficient algorithms for approximating the longest increasing subsequence

Abstract: The theory of approximation algorithms for optimization problems developed largely in response to our inability to find efficient exact solutions to NP-hard optimization problems. In recent years, motivated by the increasingly huge size of data sets, researchers have turned to searching for extremely efficient approximation algorithms for computational problems for which exact polynomial time algorithms are known.

One such problem is the classic problem of finding the longest increasing subsequence (LIS) within a given array. Simple O(n log n) algorithms, based on dynamic programming, are known for solving this problem exactly on arrays of length n.

Micheal Saks discussed recent work of C. Seshadhri and myself, in which we obtain very efficient algorithms for approximating the LIS within two different models of computation.

In the standard computational model we obtain an approximation algorithm that runs in time polylogarithmic in the input size. Specifically for any chosen constant c > 0, the algorithm outputs an approximation to the length of the LIS that (with high probability) is accurate to within an additive cn. Previously, the best known polylogarithmic time algorithms could achieve only an additive n/2 approximation. In the data stream model, the input data is not available for random access, but rather arrives as a sequence of data items. The goal is to perform the computation on the data while minimizing the space needed. Our second algorithm for LIS is a streaming algorithm that, given any b > 0, uses only space polylogarithmic in n, and has approximation error at most bD, where D is the distance from monotonicity of the input (the input length minus the length of the LIS).

 ===========================================================

Amit Kumar:

Title : Clustering problems and the $k$-means algorithm

Abstract: Given a set of points $P$ in a $d$-dimensional Euclidean space, the $k$-means clustering problem seeks to find a set $C$ of $k$ centers such that the sum over all points in $P$ of the square of the distance to the closest center in $C$ is minimized. A popular heuristic for this problem is the Lloyd's algorithm -- start with an initial guess of $k$ centers as seeds. Based on these $k$ centers, partition the set of points into $k$ clusters, where each point gets assigned to the closest center. Now, we update the set of centers as the means of each of these clusters. This process is repeated till we get convergence. It turns out that the choice of initial centers is crucial for obtaining good guarantees. Recent work has shown that one can get a good approximation algorithm if the initial centers are chosen from a special distribution, which we call $D^2$-sampling. In this talk, we will begin by surveying this line of work, and show that ideas based on $D^2$-sampling lead to a very simple linear time PTAS (for constant values of $k$) for many clustering problems which are based on optimizing a single objective function.

We also consider a different class of clustering problems which assume an implicit partition of the data points into $k$ clusters which the clustering algorithm is required to recover. Often, these points are generated by a mixture of $k$ probability distributions where the means of the distributions are well-separated, i.e., the distance between the means of any two distributions is at least $\Omega(k)$ standard deviations. We illustrate the power of Lloyd's algorithm by showing that it works (with a suitable choice of initial centers) in such settings if the data points satisfy the following (deterministic) condition -- the projection of any data point onto the line joining its cluster center to any other cluster center is $\Omega(k)$ standard deviations closer to its own center than the other center. Here the notion of standard deviations is based on the spectral norm of the matrix whose rows represent the difference between a point and the mean of the cluster to which it belongs. We show that in many (probabilistic) generative models, the above condition is satisfied with high probability and so we are able to derive most known results for generative models as corollaries of this result.

This is joint work with Ragesh Jaiswal, Ravi Kannan and Sandeep Sen.

 ============================================================

Nikhil Bansal:

Title: Refined approaches to Randomized Rounding

Abstract: Randomized rounding is a classic method to produce an integral 0/1 solution from a fractional one by interpreting the fractions as probabilities.

However, in many situations this rounding is too naive and loses various nice properties that the fractional solution may have possessed. We will survey various dependent rounding approaches developed in recent years that achieve the benefits of randomized rounding while maintaining other desirable properties. In particular, we will see how several recent results such as sub-logarithmic approximation for asymmetric TSP, constructive algorithms for discrepancy minimization and additive guarantees for degree bounded spanning trees can be viewed from this lens.

 ============================================================

Nisheeth Vishnoi:

Title: How Secure are Lattice Based Cryptosystems?

Abstract:
The security of several cryptosystems is based on our apparent inability to solve the following computational problem: given as input a basis B for a lattice and a target vector t, the goal is to find the lattice point closest to t. This problem, referred to as the Closest Vector Problem (CVP), is not only NP-hard, it is hard to approximate to a factor better than 2^{log n/loglog n}, where n is the dimension of the lattice.

However, in cryptographic applications, often B is known in advance and can be pre-processed arbitrarily. Thus, the security of such a cryptosystem actually relies on the following pre-processing version of CVP, called CVPP: here B can be pre-processed arbitrarily and the input consists just of t. Could CVPP be much easier than CVP? For instance, one can compute the shortest vector in the lattice generated by B, a NP-hard problem, for free. Indeed, it was shown by Ahoronov and Regev how to approximate CVPP to within about \sqrt{n} factor; compare this to the best known approximation factor for CVP, which is about 2^n.

In this talk, Nisheeth Vishnoi outlined a result which shows that CVPP is hard to approximate to within a factor of 2^{(log n)^(1-\epsilon)} unless NP is contained in quasi-poly time. Thus, knowing the lattice in advance does not compromise the cryptosystem. A similar result is proved in the setting of error correcting codes where one wants to understand the complexity of decoding when the generator for the code is known in advance and can be pre-processed.

The talk was based on a joint work with Subhash Khot and Preyas Popat.

 =========================================================