Microsoft Research India Theory Day 2008

Microsoft Research India (MSR) Theory Day 2008 featured a one-day workshop on Theoretical Computer Science, jointly organized by Microsoft Research India and the Indian Institute of Science (IIsc), Bangalore.

MSR India Theory Day 2008 consisted of a day of research talks by eminent researchers in Theoretical Computer Science – László Lovász, Manindra Agrawal, P.S. Thiagarajan and Vijay Vazirani. There were talks by researchers from MSR India - Manik Varma and Aditya Nori. The talks were intended to expose the audience to recent research advances and inspire young researchers, as well as stimulate new directions in research in this field. The talks were also of great value to faculty and established researchers as a first-hand update.

# Speakers

László Lovász, Eotvos Lorand University
P.S. Thiagarajan, National University of Singapore
Manik Varma, Microsoft Research India
Manindra Agrawal, Indian Institute of Technology Kanpur
Vijay V. Vazirani, Georgia Institute of Technology

# Organizing Committee

Ravi Kannan, Microsoft Research India
Satya Lokam, Microsoft Research India
G. Rangarajan, Indian Institute of Science

# Talk Abstracts

Some mathematics behind graph property testing - László Lovász

Graph property testing is the third reincarnation of the same general question, after statistics and learning theory. In its simplest form, we have a huge graph (we don't even know its size), and we draw a sample of the node set of bounded size. What properties of the graph can be deduced from this sample?

Ther graph property testing model was first introduced by Goldreich, Goldwasser and Ron (but related questions were considered before). In the context of dense graphs, a very general result is due to Alon and Shapira, who proved that every hereditary graph property is testable.

Using the theory of graph limits, Lovasz and Szegedy defined an analytic version of the (dense) graph property testing problem, which can be formulated as studying an unknown 2-variable symmetric function through sampling from its domain and studying the random graph obtained when using the function values as edge probabilities. This analytic version allows for simpler formulation of the problems, and leads to various characterizations of testable properties. These results can be applied to the original graph-theoretic property testing. In particular, they lead to a new combinatorial characterization of testable graph properties.

We survey these results, along with analogous results for graphs with bounded degree.

Approximating Bio-pathways Dynamics - P.S. Thiagarajan

Signaling pathways constitute a fundamental class of intra-cellular processes. A multitude of signaling pathways govern and coordinate the behavior of cells and the malfunctioning of theses pathways govern diseases such as diabetes and cancer. Hence the study of signaling pathways is of critical importance.

A standard formalism used to model signaling pathways (and other bio-pathways) is a system of Ordinary Differential Equations (ODEs); the equations will describe specific bio-chemical reactions while the variables will typically represent protein concentration levels.

Signaling pathways usually involve a large number of molecular species and bio-chemical reactions.

Hence the corresponding ODEs will not yield closed form solutions and in fact, will be difficult to study even via numerical solutions. We propose a probabilistic approximation by first sampling a representative set of trajectories and then exploit the structure of the pathway to encode these trajectories compactly as a Bayesian network (BN). As a result, many interesting dynamic properties can be analyzed efficiently through standard Bayesian inference techniques, instead of resorting to a large number of ODE simulations. Our preliminary results are very promising in terms of both accuracy and efficiency.

Multiple Kernel Learning - Manik Varma

Support Vector Machines (SVMs) are basic tools in machine learning that are used for classification, regression, ranking, etc. They find applications in numerous fields including vision, natural language processing, bioinformatics, information retrieval, search and software engineering. The area of Multiple Kernel Learning (MKL) is concerned with learning the kernel in an SVM setting. In particular, it focuses on how the kernel can be learnt as an optimal linear combination of given base kernels. Many MKL formulations have been proposed in the literature with associated optimizations varying from Semi-definite Programming to Semi-infinite Linear Programming. In this talk, we develop a general formulation that can be optimized extremely efficiently via gradient descent. Our proposed formulation extends traditional MKL by going beyond convex linear combinations of base kernels. It also handles varied loss functions and regularizers.

To demonstrate the power of the proposed formulation, we apply our MKL algorithm to challenging, benchmark problems in computer vision and show greatly improved performance over the state-of-the-art. For instance, in object recognition, classification accuracy on the Caltech 101 database now stands at over 90% and we also won the Caltech 256 challenge by getting a classification accuracy of over 60%. We also demonstrate the versatility of our MKL algorithm by using it to predict facial attractiveness as well as learning minimal sets of discriminative pixels and image regions for classification.

Proving Lower Bounds on Arithmetic Circuits - Manindra Agrawal

We survey the results known for arithmetic circuit lower bounds and show that to obtain strong lower bounds it is enough to derandomize the Identity Testing problem on a restricted class of depth four circuits.

Proofs from Tests - Aditya Nori

We present an algorithm DASH to check if a program P satisfies a safety property φ. The unique feature of the algorithm is that it uses only test generation operations and it refines and maintains a sound program abstraction as a consequence of failed test generation operations. Thus, each iteration of the algorithm is inexpensive, and can be implemented without any global may-alias information. In particular, we introduce a new refinement operator WPα that uses only the alias information obtained by executing a test to refine abstractions in a sound manner. We present a full exposition of the DASH algorithm, its theoretical properties, and its implementation in a software property checking tool called YOGI.

Nash Bargaining via Flexible Budget Markets - Vijay V. Vazirani

In his seminal 1950 paper, John Nash defined the bargaining game; the ensuing theory of bargaining lies at the heart of game theory. In this work, we initiate an algorithmic study of Nash bargaining games.

For a certain class of Nash bargaining games, we show that they can be transformed into a market model whose equilibrium yields the solution to the Nash bargaining game. This market model is a natural variant of a traditional market model from mathematical economics, dating back to 1891.

We then extend techniques developed within theoretical computer science over the last seven years to give a combinatorial, polynomial time algorithm for computing an equilibrium for this market model. Over the years, a fascinating theory has started forming around a convex program given by Eisenberg and Gale in 1959. Besides market equilibria, this theory touches on such disparate themes as TCP congestion control and efficient solvability of nonlinear programs by combinatorial means. Our work shows that the Nash bargaining problem fits harmoniously in this collage of ideas.

# Contact Information

For any queries regarding the programme, please send an email to indiaerp@microsoft.com.

# Previous Edition of Theory Day

Microsoft Research India Theory Day - December 2007