Together with Josh Tenenbaum and Sham Kakade, we started the New England Machine Learning Day. I'm chairing NEML 2015
.

James Zou, Kamalika Chaudhuri, and Adam Kalai
Crowdsourcing Feature Discovery via Adaptively Chosen Comparisons
arXiv:1504.00064, 2015.
abstract pdf
We introduce an unsupervised approach to efficiently discover the underlying features in a data set via crowdsourcing.
Our queries ask crowd members to articulate a feature common to two out of three displayed examples.
In addition, we also ask the crowd to provide binary labels to the remaining examples based on the discovered features.
The triples are chosen adaptively based on the labels of the previously discovered features on the data set.
In two natural models of features, hierarchical and independent, we show that a simple adaptive algorithm, using such "twooutofthree" similarity queries,
recovers all features with less labor than any non adaptive algorithm. Experimental results validate the theoretical ﬁndings.

Peter Organisciak, Jaime Teevan, Susan Dumais, Robert C. Miller, and Adam Kalai
A Crowd of Your Own: Crowdsourcing for OnDemand Personalization
In Proceedings of the 2nd AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2014). Pittsburg, PA. November 24, 2014. Notable paper award.
abstract pdf
Personalization is a way for computers to support people's
diverse interests and needs by providing content tailored to
the individual. While strides have been made in algorithmic
approaches to personalization, most require access to a
significant amount of data. However, even when data is
limited online crowds can be used to infer an individual's
personal preferences. Aided by the diversity of tastes among
online crowds and their ability to understand others, we show
that crowdsourcing is an effective ondemand tool for
personalization. Unlike typical crowdsourcing approaches
that seek a ground truth, we present and evaluate two
crowdsourcing approaches designed to capture personal
preferences. The first, tastematching, identifies workers
with similar taste to the requester and uses their taste to infer
the requester’s taste. The second, tastegrokking, asks
workers to explicitly predict the requester’s taste based on
training examples. These techniques are evaluated on two
subjective tasks, personalized image recommendation and
tailored textual summaries. Tastematching and tastegrokking
both show improvement over the use of generic
workers, and have different benefits and drawbacks
depending on the complexity of the task and the variability of
the taste space.

Kuat Yessenov, Shubham Tulsiani, Aditya Menon, Robert C. Miller, Sumit Gulwani, Butler Lampson, and Adam Kalai
A Colorful Approach to Text Processing by Example
In Proceedings of the 26th annual ACM Symposium on User Interface Software and Technology (UIST) 2013.
abstract pdf
Text processing, tedious and errorprone even for
programmers, remains one of the most alluring targets of
Programming by Example. An examination of realworld
text processing tasks found on help forums reveals that
many such tasks, beyond simple string manipulation,
involve latent hierarchical structures.
We present STEPS, a programming system for processing
structured and semistructured text by example. STEPS
users create and manipulate hierarchical structure by
example. In a betweensubject user study on fourteen
computer scientists, STEPS compares favorably to
traditional programming.

Adam and Ehud Kalai
Cooperation in Two Person Games, Revisited
Quarterly Journal of Economics, 128(2):917966, 2013.
abstract pdf
For twoperson completeinformation strategic games with transferable utility, all ma
jor variablethreat bargaining and arbitration solutions coincide. This conuence of
solutions by luminaries such as Nash, Harsanyi, Raiffa, and Selten, is more than mere
coincidence.
Staying in the class of twoperson games with transferable unility, the present
paper presents a more complete theory that expands their solution. Specifically, it
presents: (1) a decomposition of a game into cooperative and competitive components,
(2) an intuitive and computable closedform formula for the solution, (3) an axiomatic
justification of the solution, and (4) a generalization of the solution to games with
private signals, along with an arbitration scheme that implements it. The objective is
to restart research on cooperative solutions to strategic games and their applications.

Aditya Menon, Omer Tamuz, Sumit Gulwani, Butler Lampson, and Adam Kalai
Textual Features for Programming by Example
In Proceedings of the 30th International Conference on Machine Learning (ICML), 2013.
abstract pdf
In Programming by Example, a system attempts to infer a program from input and output examples, generally by searching for a composition of certain base
functions. Performing a naive brute force search is infeasible for even mildly involved tasks. We note that the examples themselves often present clues as to which
functions to compose, and how to rank the resulting programs. In text processing,
which is our domain of interest, clues arise from simple textual features: for example, if parts of the input and output strings are permutations of one another, this
suggests that sorting may be useful. We describe a system thatlearns the reliability
of such clues, allowing for faster search and a principled ranking over programs.
Experiments on a prototype of this system show that this learning scheme facilitates efﬁcient inference on a range of text processing tasks.

Sivan Sabato and Adam Kalai
Feature MultiSelection among Subjective Features
In Proceedings of the 30th International Conference on Machine Learning (ICML), 2013.
abstract pdf
When dealing with subjective, noisy, or otherwise nebulous features, the "wisdom of crowds" suggests that one may benefit from multiple judgments of the same feature on the same object. We give theoreticallymotivated `feature multiselection' algorithms that choose, among a large set of candidate features, not only which features to judge but how many times to judge each one. We demonstrate the effectiveness of this approach for linear regression on a crowdsourced learning task of predicting people's height and weight from photos, using features such as 'gender' and 'estimated weight' as well as culturally fraught ones such as 'attractive'.

Omer Tamuz, Ce Liu, Serge Belongie, Ohad Shamir, and Adam Kalai
Adaptively Learning the Crowd Kernel
In Proceedings of the 28th International Conference on Machine Learning (ICML), 2011.
abstract pdf powerpoint
We introduce an algorithm that, given n objects,
learns a similarity matrix over all n^2
pairs, from crowdsourced data alone. The algorithm
samples responses to adaptively chosen
tripletbased relativesimilarity queries.
Each query has the form "is object a more
similar to b or to c?" and is chosen to be
maximally informative given the preceding
responses. The output is an embedding of
the objects into Euclidean space (like MDS);
we refer to this as the "crowd kernel." SVMs
reveal that the crowd kernel captures prominent
and subtle features across a number of
domains, such as " is striped" among neckties
and "vowel vs. consonant" among letters.

Adam Kalai and Ehud Kalai
Cooperation in Two Person Games, Revisited
ACM SIGEcom Exchanges, 10(1):1316.
abstract pdf
Luminaries such as Nash (1953), Rai
a (1953), and Selten (1960), studied cooperation in twoperson strategic games.
We point out that when players may make monetary side payments, i.e.,
bimatrix games with transferable utility (TU), all previous solutions coincide. This solution is justified by
simple axioms and an elementary decomposition of every such game into a (competitive)
zerosum game and a (cooperative) team game.

Brendan Juba, Adam Tauman Kalai, Sanjeev Khanna, and Madhu Sudan
Compression without a Common Prior: an InformationTheoretic Justification for Ambiguity in Natural Language
In Innovations in Computer Science (ICS), 2011.
abstract pdf
Compression is a fundamental goal of both human language and digital communication, yet natural
language is very different from compression schemes employed by modern computers. We partly explain this
difference using the fact that information theory generally assumes a common prior probability distribution
shared by the encoder and decoder, whereas human communication has to be robust to the fact that a speaker and
listener may have different prior beliefs about what a speaker may say. We model this informationtheoretically
using the following question: what type of compression scheme would be effective when the encoder and decoder
have (boundedly) different prior probability distributions. The resulting compression scheme resembles natural
language to a far greater extent than existing digital communication protocols. We also use information theory
to justify why ambiguity is necessary for the purpose of compression.

Nicole Immorlica, Adam Tauman Kalai, Brendan Lucier, Ankur Moitra, Andrew Postlewaite, and Moshe Tennenholtz
Dueling Algorithms
In Proceedings of the fortysecond annual ACM symposium on the Theory of Computing (STOC), 2011.
abstract pdf press
We revisit classic algorithmic search and optimization problems from the perspective of competition. Rather than a single optimizer minimizing expected cost, we consider a zerosum game in which an optimization problem is presented to two players, whose only goal is to outperform the opponent. Such games are typically exponentially large zerosum games, but they often have a rich combinatorial structure. We provide general techniques by which such structure can be leveraged to find minmaxoptimal and approximate minmaxoptimal strategies. We give examples of ranking, hiring, compression, and binary search duels, among others. We give bounds on how often one can beat the classic optimization algorithms in such duels.

Sham M. Kakade, Adam Kalai, Varun Kanade, and Ohad Shamir
Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression
In Advances in Neural Information Processing Systems 23 (NIPS), pp. 927935, 2011.
abstract pdf
Generalized Linear Models (GLMs) and Single Index Models (SIMs) provide
powerful generalizations of linear regression, where the target variable is assumed
to be a (possibly unknown) 1dimensional function of a linear predictor. In general,
these problems entail nonconvex estimation procedures, and, in practice,
iterative local search heuristics are often used. Kalai and Sastry (2009) provided
the first provably efficient method, the Isotron algorithm, for learning SIMs and
GLMs, under the assumption that the data is in fact generated under a GLM and
under certain monotonicity and Lipschitz (bounded slope) constraints. The Isotron
algorithm interleaves steps of perceptronlike updates with isotonic regression (fitting
a onedimensional nondecreasing function). However, to obtain provable
performance, the method requires a fresh sample every iteration. In this paper, we
provide algorithms for learning GLMs and SIMs, which are both computationally
and statistically efficient. We modify the isotonic regression step in Isotron to fit
a Lipschitz monotonic function, and also provide an efficient O(n log(n)) algorithm
for this step, improving upon the previous O(n^2) algorithm. We provide a
brief empirical study, demonstrating the feasibility of our algorithms in practice.

Adam Tauman Kalai,
Ehud Kalai
Cooperation and competition in strategic games with private information
June 2010; EC '10: Proceedings of the 11th ACM conference on Electronic commerce.
abstract
pdf (purchase)
We study strategic games in the tradition of cooperative game theory, where players may make binding agreements that involve side payments. We use a decomposition of strategic games into competitive and cooperative components to define and axiomatize a solution that unifies earlier solutions and generalizes to the case of incomplete information. In the latter case, we give incentive compatible mechanisms that achieve firstbest efficiency.

Adam Tauman Kalai, Ankur Moitra and Gregory Valiant
Efficiently Learning Mixtures of Two Gaussians
In Proceedings of the fortyfirst annual ACM symposium on the Theory of Computing (STOC), 2010.
abstract pdf
We consider the basic problem of learning the parameters of a mixture of two arbitrary mutlivariate Gaussians. Under provably minimal assumptions (namely that the mixing weights are bounded from 0 and the statistical distance between the two Gaussians is bounded from 0), we give a polynomialtime algorithm for extimating the mixture parameters that converges at an inverse polynomial rate. Previously, efficient algorithms were known for the special case where the Gaussians had little overlap, namely there statistical distance was close to 1.

Adam Tauman Kalai, Ehud Kalai,
Ehud Lehrer, and Dov Samet
A Commitment Folk Theorem
Games and Economic Behavior, 69(1): 127137, 2010.
abstract pdf
Real world players often increase their payoffs by voluntarily committing to play a fixed strategy, prior to the start of a strategic game. In fact, the players may further benefit from commitments that are conditional on the commitments of others.
This paper proposes a model of conditional commitments that unifies earlier
models while avoiding circularities that often arise in such models.
A commitment folk theorem shows that the potential of voluntary conditional
commitments is essentially unlimited. All feasible and individuallyrational
payoffs of a twoperson strategic game can be attained at the equilibria
of one (universal) commitment game that uses simple commitment devices.
The commitments are voluntary in the sense that each player maintains the
option of playing the game without commitment, as originally defined.

Adam Tauman Kalai, Michael Mitzenmacher, and Madhu Sudan
Tight Asymptotic Bounds for the Deletion Channel with Small Deletion Probabilities
In Proc. IEEE International Symposium on Information Theory (ISIT), 2010.
abstract pdf
In this paper, we consider the capacity C of the
binary deletion channel for the limiting case where the deletion probability p goes to 0. It is known that for any p < 1/2, the capacity satisfies C >= 1?H(p), where H is the standard binary entropy. We show that this lower bound is essentially tight in the limit, by providing an upper bound C <= 1?(1?o(1))H(p), where the o(1) term is understood to be vanishing as p goes to 0. Our proof utilizes a natural counting argument that should prove helpful in analyzing related channels.

Adam Tauman Kalai, Michal Feldman, and Moshe Tennenholtz
Playing Games without Observing Payoffs
In Innovations in Computer Science (ICS), 2010.
pdf
There is no abstract available for this paper.

Aaron Roth, Nina Balcan , Adam Kalai, and Yishay Mansour
On the Equilibria of Alternating Move Games
Proceedings of the Twentyfirst Annual ACMSIAM Symposium on Discrete Algorithms (SODA), 2010.
abstract pdf
We consider computational aspects of alternating move
games, repeated games in which players take actions at
alternating time steps rather than playing simultaneously.
We show that alternating move games are more
tractable than simultaneous move games: we give an
FPTAS for computing an εapproximate equilibrium of
an alternating move game with any number of players.
In contrast, it is known that for k >= 3 players, there
is no FPTAS for computing Nash equilibria of simultaneous
move repeated games unless P = PPAD. We
also consider equilibria in memoryless strategies, which
are guaranteed to exist in two player games. We show
that for the special case of k = 2 players, all but a negligible
fraction of games admit an equilibrium in pure
memoryless strategies that can be found in polynomial
time. Moreover, we give a PTAS to compute an ε
approximate equilibrium in pure memoryless strategies
in any 2 player game that admits an exact equilibrium
in pure memoryless strategies.

Christian Borgs, Jennifer Chayes, Adam Tauman Kalai, Azarakhsh Malekian, and Moshe Tennenholtz
A novel approach to propagating distrust
WINE 2010.
abstract pdf
Trust propagation is a fundamental topic of study in the theory and practice of ranking
and recommendation systems on networks. The Page Rank algorithm ranks web pages
by propagating trust throughout a network, and similar algorithms have been designed for
recommendation systems. How might one analogously propagate distrust as well? This is a
question of practical importance and mathematical intrigue. However, it has
proven challenging to model distrust propagation in a manner which is both logically consistent
and psychologically plausible. We propose a novel and simple extension of the Page Rank
equations, and argue that it naturally captures most types of distrust that are expressed in
such networks. We give an efficient algorithm for implementing the system and prove desirable
properties of the system.

Adam Tauman Kalai, Alex Samorodnitsky, and ShangHua Teng
Learning and Smoothed Analysis
Proceedings of the 50th Annual Symposium on Computer Science (FOCS), 2009
abstract pdf pdf (long version) slides (ppt)
We give a new model of learning motivated by
smoothed analysis (Spielman and Teng, 2001). In this model,
we analyze two new algorithms, for PAClearning DNFs and
agnostically learning decision trees, from random examples
drawn from a constantbounded product distributions. These
two problems had previously been solved using membership
queries (Jackson, 1995; Gopalan et al, 2005). Our analysis
demonstrates that the "heavy" Fourier coefficients of a DNF
suffice to recover the DNF. We also show that a structural
property of the Fourier spectrum of any boolean function over
"typical" product distributions.
In a second model, we consider a simple new distribution
over the boolean hypercube, one which is symmetric but is not
the uniform distribution, from which we can learn O(log n)
depth decision trees in polynomial time.

Adam Tauman Kalai and Varun Kanade
Potentialbased agnostic boosting
NIPS 09
abstract pdf
We prove strong noisetolerance properties of a potentialbased boosting algorithm,
similar to MadaBoost (Domingo and Watanabe, 2000) and SmoothBoost
(Servedio, 2003). Our analysis is in the agnostic framework of Kearns, Schapire
and Sellie (1994), giving polynomialtime guarantees in presence of arbitrary
noise. A remarkable feature of our algorithm is that it can be implemented without
reweighting examples, by randomly relabeling them instead. Our boosting
theorem gives, as easy corollaries, alternative derivations of two recent nontrivial
results in computational learning theory: agnostically learning decision trees
(Gopalan et al, 2008) and agnostically learning halfspaces (Kalai et al, 2005).
Experiments suggest that the algorithm performs similarly to MadaBoost.

Adam Tauman Kalai and Ravi Sastry
The Isotron Algorithm: HighDimensional Isotonic Regression
Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
abstract pdf
The Perceptron algorithm elegantly solves binary classification problems that have a margin between positive and negative examples. Isotonic regression (fitting an arbitrary increasing function in one dimension) is also a natural problem with a simple solution. By combining the two, we get a new simple regression algorithm in high dimensions, with strong guarantees.

Varun Kanade, Adam Kalai, and Yishay Mansour
Reliable Agnostic Learning
Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
abstract pdf
We give efficient algorithms for learning in a model of learning with onesided agnostic noise, such as a model of SPAM emails in which legit emails come from one set but SPAM emails may be distributed arbitrarily. The goal is to achieve an accuracy as high as possible while maintaining a nearzero rate of false positives. We show how to learn the class of DNFs using membership queries in this setting.

Parikshit Gopalan, Adam Tauman Kalai, and Adam R. Klivans
Agnostically Learning Decision Trees
In Proceedings of the thirtyninth annual ACM symposium on Theory of computing (STOC), pp. 527536, 2008
abstract pdf Slides (ppt)
We consider the problem of learning a decision tree in the presence of arbitrary noise. More formally, we are given blackbox access (a type of active learning) to an arbitrary binary function on n bits, and our output is a function whose accuracy is almost as good as that of the best sizes decision tree, where accuracy is measured over the uniform distribution on inputs.

Christian Borgs,
Jennifer Chayes,
Nicole Immorlica,
Adam Tauman Kalai,
Vahab Mirrokni, and
Christos Papadimitriou
The Myth of the Folk Theorem
In Proceedings of the thirtyninth annual ACM symposium on Theory of computing (STOC), pp. 365372, 2008
abstract pdf
The folk theorem for repeated games suggests that repeated games should by easier to play than oneshot games. This can be made formal by giving an efficient algorithm for finding Nash equilibria in any repeated game with two players. Perhaps surprisingly, we show that for three or more players, it is as hard to find Nash equilibria in repeated games as it is in oneshot games.

Adam Tauman Kalai, Yishay Mansour, and Elad Verbin
Agnostic Boosting and Parity Learning
To appear in Proceedings of the thirtyninth annual ACM symposium on Theory of computing (STOC), pp. 629638, 2008
abstract pdf slides (ppt)
We give the first boosting algorithm that is capable of handling arbitrary noise and boosting to optimal accuracy. We give an application to the problem of learning parity with arbitrary noise, over an arbitrary distribution.

Reid Andersen,
Christian Borgs,
Jennifer Chayes,
Uriel Feige,
Abraham Flaxman,
Adam Tauman Kalai,
Vahab Mirrokni,
Moshe Tennenholtz
TrustBased Recommendation Systems: an Axiomatic Approach
To appear in Proceedings of the 17th international conference on World Wide Web (WWW), 2008
abstract pdf
We cast the trustbased recommendation system problem as a voting problem on a graph with a partial +/ assignment, where each node receives a personalized recommendation. We consider natural axioms that lead to a unique algorithm, as well alternative axioms that lead to impossibility.

Adam Tauman Kalai
Learning Nested Halfspaces and Uphill Decision Trees
In Proceedings of 20th Annual Conference on Learning Theory (COLT), pp. 378392, 2007
abstract pdf
Predicting class probabilities and other realvalued quantities is often more useful than binary classification, but comparatively little work in PACstyle learning addresses this issue. We show that two rich classes of realvalued functions are learnable in the probabilisticconcept framework of Kearns and Schapire. The classes naturally generalize halfspaces, decision lists, and SVMs.

Sham Kakade,
Adam Tauman Kalai, and
Katrina Ligett
Playing Games with Approximation Algorithms
In Proceedings of the thirtyninth annual ACM symposium on Theory of computing (STOC), pp. 546555, 2007. Subsequently published in SIAM J. on Computing, 2009
abstract pdf
How should one play a large repeated twoperson game when you can only compute approximate best responses? How should one adaptively solve repeated optimization problems that are so complex that one can only approximately solve the offline (fullinformation) counterpart? This paper answers these two questions. The approach builds on a nice algorithm of Zinkevich (2003) and is very different from the previous work (for the exact cases) due to Hannan (1957) and Kalai and Vempala (2005, see below).

Eli BenSasson,
Adam Tauman Kalai, and
Ehud Kalai
An Approach to Bounded Rationality
In Advances in Neural Information Processing Systems 19 (NIPS), pp. 145152, 2007
abstract pdf
Many interactions in complex environments, e.g., chess, are affected by computational limitations. An extreme example is the factoring game, where the first player chooses a large number and sends it to the second player who then attempts to factor it. Ignoring computational considerations, the second player can factor any number and win, but with computational considerations the game seems to favor the first player. This wellknown issue in gametheory falls under the term bounded rationality, yet there is no general model of playing games with computational limitations. (Prior work focused mainly on the case of finite automata playing repeated games like prisoner's dilemma.) We propose an extremely simple model of a game with additional costs (computational or otherwise) for each strategy. We illustrate that this model fits nicely into existing concepts in game theory such as zerosum games, potential games, and submodular games, showing that natural learning dynamics continue to converge to equilibrium.

Ivona Bezakova,
Adam Tauman Kalai, and
Rahul Santhanam
Graph Model Selection using Maximum Likelihood
In Proceedings of the 23rd International Conference on Machine Learning (ICML), pp. 105112, 2006
abstract pdf
In recent years, a number of random graph models, such as preferential attachment, have been proposed as probabilistic models of large graphs. We suggest an objective method for ranking their performance on actual graphs. In particular, we look at the probability that a model assigns to a given graph, and design efficient MCMC algorithms for estimating these quantities.

Elad Hazan,
Adam Tauman Kalai,
Satyen Kale, and
Amit Agarwal
Logarithmic Regret Algorithms for Online Convex Optimization
In Proceedings of 19th Annual Conference on Learning Theory (COLT), pp. 499513, 2006.
abstract pdf
We present improved algorithms for the problem of online optimization, that use second derivatives, analogous to Newton's method.

Adam Tauman Kalai and
Santosh Vempala
Simulated Annealing for Convex Optimization
Math of OR, 31(2), pp. 253266, 2006
abstract pdf slides: 15min ppt slides: 50min ppt
Simulated annealing is a generalpurpose optimization technique that has proven useful in practice, but is notoriously difficult to analyze.
We show that it can be used for the problem of minimizing a linear function over a convex set, where the set is specified only by a membership oracle.
Moreover, its guaranteed convergence rate is faster than that of any known alternative algorithm.

Abraham Flaxman ,
Adam Tauman Kalai, and
Brendan McMahan
Online Convex Optimization in the Bandit Setting: Gradient Descent Without a Gradient
In Proceedings of the Sixteenth Annual ACMSIAM Symposium on Discrete Algorithms (SODA), pp. 385394, 2005
abstract pdf slides: 20min ppt slides: 50min ppt
In many problems, a decisionmaker has to make a sequence of decisions online, with limited feedback. For example, consider a company has to choose how much to spend in advertising through a number of channels each month. The feedback the company receives may be only the company's total profit that period. Such feedback is indirect and noisy  it may be more influenced by the economy than the company's choices during that period. To model such problems, we consider a fixed convex feasible set. On the ith period, the decision maker chooses a feasible point x_i, and receives some reward, f_i(x_i), where f_i is any bounded concave function. The only feedback the decision maker receives is this reward f_i(x_i)  no other information is revealed about f_i. We give an efficient algorithm that guarantees an average reward that approaches the best average reward achievable by a single feasible point, where the best is chosen with the benefit of hindsight. These guarantees hold for an arbitrary sequence of bounded concave functions.

Sham Kakade and
Adam Tauman Kalai.
From Batch to Transductive Online Learning
In Advances in Neural Information Processing Systems 18, 2006 (NIPS 05). (20min NIPS ppt)
abstract pdf
We consider the online transductive learning (BenDavid, Kushilevitz and Mansour, 1995) model where an arbitrary sequence of examples are presented to a learner, one at a time. The learner must predict each label, online, given only the labels of the previous examples and the future unlabeled examples. In contrast, more standard i.i.d. models of learning assume that examples are drawn independently and identically distributed from a fixed distribution. We give an algorithm that employs unlabeled data to convert any efficient learning algorithm in an i.i.d. setting to an efficient learning algorithm in the more difficult online transductive model.

Adam Tauman Kalai,
Adam Klivans,
Yishay Mansour, and
Rocco Servedio
Agnostically Learning Halfspaces
In Proceedings of the 46th Annual Symposium on the Foundations of Computer Science (FOCS), pp. 1120, 2005
abstract pdf pdf (long version) slides: 15min FOCS ppt
The agnostic model of learning (Kearns, Schapire, and Sellie, 1992) is a natural model of learning where one aims to do as well as the best function in a given class of functions. Unfortunately, there are many previous negative and impossibility results for agnostic learning, due to computational intractability. We give a new algorithm for agnostic learning, based on the celebrated learning algorithm of Linial, Mansour, and Nisan. We show that, under mild distributional assumptions, it learns to predict as well as the best halfspace, in n dimensions. The algorithm does not suffer from the ``curse of dimensionality  it runs in time polynomial rather than exponential in the dimension n.

Sanjoy Dasgupta,
Adam Tauman Kalai, and
Claire Monteleoni
Analysis of PerceptronBased Active Learning
In Proceedings of 18th Annual Conference on Learning Theory (COLT), 2005.
abstract pdf
In active learning, the set of unlabeled examples is known in advance, and the learner can choose which examples are labeled as training data. We give a simple and efficient active learning procedure based on the classic Perceptron algorithm. Under distributional assumptions, the number of points that must be labeled to achieve any error rate is logarithmic in the desired error rate.

Adam Tauman Kalai and
Rocco Servedio
Boosting in the Presence of Noise
Journal of Computer and System Sciences 71(3): 266290, 2005
abstract pdf pdf (STOC version)
Boosting is a technique that attempts to improve the accuracy of any learning algorithm. It has both strong theoretical guarantees as well as overwhelming empirical success. One oftencited weakness of boosting is its inability to deal with noisy data, even when the noise is chosen completely randomly and independently. Following work of Kearns, Mansour, and McAllester, we show that a simple divideandconquer boosting algorithm can boost optimally in the presence of noise.

Adam Tauman Kalai and
Santosh Vempala
Efficient Algorithms for Online Optimization
Journal of Computer and System Sciences 71(3): 291307, 2005
abstract pdf pdf (COLT version)
In online optimization, one wants to solve a sequence of combinatorial optimization problems in the face of uncertainty. For example, each day one wants to choose a path to drive to work, and only afterwards, one finds out the times on the various roads. Suppose that one can efficiently solve the oneshot version of the optimization problem, e.g., one can easily find the shortest path in a graph with known times. We give an efficient procedure for adaptively choosing solutions to a sequence of such problems, where the average quality of the solution approaches that of the best single solution. This holds for an arbitrary sequence of problems, yet guarantees are given with respect to the best solution, chosen with the benefit of hindsight. Our algorithm is as efficient as the best offline solution, i.e., the fastest algorithm to solve the problem if the entire sequence were known in advance.

Adam Tauman Kalai
Learning Monotonic Linear Functions
In Proceedings of 17th Annual Conference on Learning Theory, pp. 487501
abstract pdf pdf (long version) slides: 50min (ppt)
Generalized Additive Models (Hastie and Tibshirani) are a broad generalization of both linear and logistic regression. We give the first computationally efficient algorithm that provably learns this class of models. It is efficient in two sense: its runtime is polynomial in the size of the training data, and its error approaches optimal at a rate that is inverse polynomial.

Avrim Blum,
Suchi Chawla, and
Adam Tauman Kalai
Static Optimality and Dynamic Search Optimality in Lists and Trees
Algorithmica 36:3, pp. 249260, 2003.
abstract pdf SODA version (ps file)
Splay Trees are an efficient algorithm for maintaining a dynamic binary search tree, popular both in theory and practice. They are guaranteed to be only a constant factor worse than the best possible static binary search tree. We give algorithms that are only (1+eps) worse than the best fixed binary search tree, and similar results for lists.

Avrim Blum,
Adam Tauman Kalai, and
Hal Wasserman
NoiseTolerant Learning, the Parity Problem, and the Statistical Query Model
JACM 50(4):506519, 2003.
abstract pdf STOC version (pdf)
The parity with noise learning problem is one of the most beautiful problems in computer science that is believed to be hard. It is not only believed to be hard on worstcase instances but also on random instances. An efficient algorithm for this problem would find applications in many fields, such as errorcorrecting codes. We give the fastestknown algorithm. While it is not truly efficient, it is the first solution that is faster than brute force.
 Adam Tauman Kalai
Generating Random Factored Numbers, Easily
Journal of Cryptology 16(4):287289, 2003.
abstract pdf SODA version (pdf)
Our goal is to generate a uniformly random number between 1 and N, along with its prime factorization. Since there are no known efficient factoring algorithms, one cannot simply generate a random number and then factor it. Eric Bach's awardwinning dissertation introduces and solves this problem. We present a fourline solution that is still efficient. (See also this web page).
 Adam Tauman Kalai
Efficient PatternMatching with Don't Cares
In Proceedings of the Thirteenth Annual ACMSIAM Symposium on Discrete Algorithms (SODA), pp. 655656, 2002.
abstract ps file
We give a randomized algorithm for the string matching with don't cares problem, the standard string matching problem with wildcards. We present a fourline solution that is slightly faster and much simpler than previous algorithms.
 Adam Tauman Kalai
Better computers for better people
Ph.D. thesis, technical report CMUCS01132, 2001
abstract pdf
On the surface, the three online machine learning problems analyzed in this thesis may seem unrelated.
The first is an online investment strategy introduced by Tom Cover. We begin with a simple analysis that
extends to the case of fixedpercentage transaction costs. We then describe an efficient implementation that
runs in time polynomial in the number of stocks. The second problem is kfold cross validation, a popular
technique in machine learning for estimating the error of a learned hypothesis. We show that this is a valid
technique by comparing it to the holdout estimate. Finally, we discuss work towards a dynamicallyoptimal
adaptive binary search tree algorithm.
 Adam Tauman Kalai and
Santosh Vempala
Efficient Algorithms for Universal Portfolios
Journal of Machine Learning Research 3(3):423440, 2002
abstract pdf FOCS version (ps file)
The Universal Portfolio is an investment strategy introduced by Thomas Cover. Previously, all known implementations required computation exponential in the number of stocks. In this paper, we use random walks for an implementation that is polynomial in the number of stocks.
 Adam Tauman Kalai and
Ehud Kalai
Strategic Polarization
Journal of Mathematical Psychology, 45:4, pp. 656663, 2001.
abstract ps file
An example of polarization is a marriage in which one spouse gives generously to charity while the other donates nothing. This may misrepresent what is, in actuality, a small discrepancy in preferences. It may be that the donating spouse would like to see 10% of their combined income go to charity each year, while the apparently frugal spouse would like to see 9% donated. A simple gametheoretic analysis suggests that the spouses will end up donating 10% and 0%, respectively. This paper gives a strategic (gametheoretic) analysis of polarization.

Avrim Blum,
Carl Burch, and
Adam Tauman Kalai
Finely Competitive Paging
In Proceedings of the 40th Annual Symposium on the Foundations of Computer Science (FOCS), pp. 450458, 1999
abstract ps file
In the online paging problem, one has a fixedsize cache. Each time a page is requested that is not in the cache, it has to be brought in at a cost of 1 and another page has to be evicted from the cache. We present an algorithm that has the same worstcase performance as the previous algorithms, but for a large and important class of request sequences, is superior.

HeungYeung Shum,
Adam Tauman Kalai, and
Steve Seitz
Omnivergent Stereo
Journal of Computer Vision 48:3, pp 159172, 2002. (also at ICCV 1999)
abstract pdf
We describe the design of an optimal camera for 3D reconstruction. In our model, a camera may capture a fixed number of pixels (light rays) from anywhere inside the circular frame of the camera. Afterwards, points outside the camera must be reconstructed as accurately as possible. We show that capturing tangent rays (in both directions) gives optimal uniform reconstructions.

Avrim Blum,
Adam Tauman Kalai, and
John Langford
Beating the Holdout: Bounds for KFold and Progressive CrossValidation
In Proceedings of the 10th Annual Conference on Computational Learning Theory (COLT), 1999
abstract ps file
KFold crossvalidation is a popular technique in machine learning for estimating the performance of a learned hypothesis on a data set. We provide the first theoretical justification for this method that shows that it is, on average, more accurate than a heldout test set of comparable size.

Stan Chen,
Adam Tauman Kalai,
Avrim Blum, and
Roni Rosenfeld
Online Algorithms for Combining Language Models
In Proceedings of the International Conference on Accoustics, Speech, and Signal Processing, 1999.
abstract ps file
We show how Cover's Universal Portfolios can be applied to the field of language modeling. The goal of a language model is to accurately predict the next word in a sequence of words, with applications to speech recognition and data compression. In this setting, the universal guarantees are very striking. We support this with experiments on several different corpora.

Avrim Blum and
Adam Tauman Kalai
Universal Portfolios With and Without Transaction Costs
Machine Learning 35:3, pp 193205, 1999
abstract ps file 15min COLT slides (html)
We present a much simpler analysis of Thomas Cover's Universal Portfolios. This analysis easily extends to the case of transaction costs.
 Adam Tauman Kalai and
Mel Siegel
Improved Rendering of Parallax Panoramagrams for a TimeMultiplexed Autostereoscopic Display
In Stereoscopic Displays and Applications IX: Proceedings of SPIE 3295, 1998
abstract ps file
We describe a new kind of 3D display that is viewable without glasses.

Avrim Blum and
Adam Tauman Kalai
A Note on Learning from MultipleInstance Examples
Machine Learning 30:1, pp. 2330, 1998
abstract ps file
In the multipleinstance learning setting introduced by Dietterich et al, the example data consist of ntuples of points. An ntuple is labeled positive if any of the n constituent points are positive. We show that this problem can be reduced to that of learning in the presence of noise.