Our research
##### Content type
+
+
Events (415)

Groups (143)
+
News (2637)

People (732)

Projects (1065)
+
Publications (12152)
+
Videos (5344)
##### Research areas
Algorithms and theory47205 (294)
Communication and collaboration47188 (190)
Computational linguistics47189 (193)
Computational sciences47190 (199)
Computer systems and networking47191 (702)
Computer vision208594 (881)
Data mining and data management208595 (78)
Economics and computation47192 (96)
Education47193 (81)
Gaming47194 (71)
Graphics and multimedia47195 (208)
Hardware and devices47196 (200)
Health and well-being47197 (81)
Human-computer interaction47198 (803)
Machine learning and intelligence47200 (781)
Mobile computing208596 (37)
Quantum computing208597 (20)
Search, information retrieval, and knowledge management47199 (626)
Security and privacy47202 (275)
Social media208598 (27)
Social sciences47203 (247)
Software development, programming principles, tools, and languages47204 (569)
Speech recognition, synthesis, and dialog systems208599 (86)
Technology for emerging markets208600 (27)
1–25 of 294
Sort
Show 25 | 50 | 100
1234567Next

We introduce symbolic tree automata as a generalization of finite tree automata with a parametric alphabet over any given background theory. We show that symbolic tree automata are closed under Boolean operations, and that the operations are effectively uniform in the given alphabet theory. This generalizes the corresponding classical properties known for finite tree automata.

##### Publication details
Date: 1 March 2015
Type: Article
Publisher: Elsevier
Number: 3

We present several quantum algorithms for performing nearest-neighbor learning. At the core of our algorithms are fast and coherent quantum methods for computing distance metrics such as the inner product and Euclidean distance. We prove upper bounds on the number of queries to the input data required to compute these metrics. In the worst case, our quantum algorithms lead to polynomial reductions in query complexity relative to the corresponding classical algorithm. In certain cases, we show...

##### Publication details
Date: 1 March 2015
Type: Article
Publisher: Rinton Press
Number: 3&4

String-manipulating programs are an important class of programs with applications in malware detection, graphics, input sanitization for Web security, and large-scale HTML processing. This paper extends prior work on BEK, an expressive domain-specific language for writing string-manipulating programs, with algorithmic insights that make BEK both analyzable and data-parallel. By analyzable we mean that unlike most general purpose programming languages, many algebraic properties of a BEK...

##### Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery

This paper proposes a new approach for discriminative clustering. The intuition is, for a good clustering, one should be able to learn a classifier from the clustering labels with high generalization accuracy. Thus we define a novel metric to evaluate the quality of a clustering labeling, named Minimum Separation Probability (MSP), which is a lower bound of the generalization accuracy of a classifier learnt from the clustering labeling. We take MSP as the objective to maximize and propose our...

##### Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence

In this paper, we investigate an approach to program synthesis that is based on crowd-sourcing. With the help of crowd-sourcing, we aim to capture the “wisdom of the crowds” to find good if not perfect solutions to inherently tricky programming tasks, which elude even expert developers and lack an easy-to-formalize specification.

We propose an approach we call program boosting, which involves crowd-sourcing imperfect solutions to a difficult programming problem from developers and then...

##### Publication details
Date: 1 January 2015
Type: Proceedings
Publisher: ACM – Association for Computing Machinery
##### Publication details
Date: 1 January 2015
Type: Article
Publisher: Nature Publishing Group
##### Publication details
Date: 1 January 2015
Type: Article
Publisher: Nature Publishing Group

Human computation or crowdsourcing involves joint inference of the ground-truth-answers and the worker abilities by optimizing an objective function, for instance, by maximizing the data likelihood based on an assumed underlying model. A variety of methods have been proposed in the literature to address this inference problem. As far as we know, none of the objective functions in existing methods is convex. In machine learning and applied statistics, a convex function such as the objective function of...

##### Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence

We consider distributed convex optimization problems originated from sample average approximation of stochastic optimization, or empirical risk minimization in machine learning. We assume that each machine in the distributed computing system has access to a local empirical loss function, constructed with i.i.d. data sampled from a common distribution. We propose a communication-efficient distributed algorithm to minimize the overall empirical loss, which is the average of the local empirical losses. The...

##### Publication details
Date: 1 January 2015
Type: Technical report
Number: MSR-TR-2015-1

We introduce the online stochastic Convex Programming (CP) problem, a very general version of stochastic online problems which allows arbitrary concave objectives and convex feasibility constraints. Many well-studied problems like online stochastic packing and covering, online stochastic matching with concave returns, etc. form a special case of online stochastic CP. We present fast algorithms for these problems, which achieve near-optimal regret guarantees for both the i.i.d. and the...

##### Publication details
Date: 1 January 2015
Type: Inproceeding
Publisher: SIAM – Society for Industrial and Applied Mathematics
##### Publication details
Date: 1 December 2014
Type: Article
Publisher: Frontiers

The use of M-estimators in generalized linear regression models in high dimensional settings requires risk minimization with hard L0 constraints. Of the known methods, the class of projected gradient descent (also known as iterative hard thresholding (IHT)) methods is known to offer the fastest and most scalable solutions. However, the current state-of-the-art is only able to analyze these methods in very restrictive settings which do not hold in high dimensional statistical models. In this...

##### Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: Neural Information Processing Systems

Modern applications in sensitive domains such as biometrics and medicine frequently require the use of non-decomposable loss functions such as precision@k, F-measure etc. Compared to point loss functions such as hinge-loss, these offer much more fine grained control over prediction, but at the same time present novel challenges in terms of algorithm design and analysis. In this work we initiate a study of online learning techniques for such non-decomposable loss functions with an aim to enable...

##### Publication details
Date: 1 December 2014
Type: Inproceeding
Publisher: Neural Information Processing Systems

We consider the problem of minimizing the sum of two convex functions: one is the average of a large number of smooth component functions, and the other is a general convex function that admits a simple proximal mapping. We assume the whole objective function is strongly convex. Such problems often arise in machine learning, known as regularized empirical risk minimization. We propose and analyze a new proximal stochastic gradient method, which uses a multistage scheme to progressively reduce the...

##### Publication details
Date: 1 December 2014
Type: Article
Publisher: SIAM – Society for Industrial and Applied Mathematics
Number: 4

Propagation of contagion through networks is a fundamental process. It is used to model the spread of information, influence, or a viral infection. Diffusion patterns can be specified by a probabilistic model, such as Independent Cascade (IC), or captured by a set of representative traces.

Basic computational problems in the study of diffusion are influence queries (determining the potency of a specified seed set of nodes) and Influence Maximization (identifying the...

##### Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery

We give message-optimal randomized algorithms for two fundamental
distributed assignment tasks, leader election and renaming.
In leader election, all n participants compete for a single item, whereas in renaming the n participants
each compete for one of n distinct items, or names.
The setting is the classic asynchronous message-passing model, where the n
processors communicate over unreliable channels, while timing and t < n / 2 processor crashes are controlled
by a...

##### Publication details
Date: 1 November 2014
Type: Technical report
Number: MSR-TR-2014-18

In several data center applications, computational tasks are processed in a distributed system, such that each task requires a set of distinct data inputs. To ensure scalability, the assignment of tasks to machines should guarantee that tasks requiring similar data inputs are collocated, and that the load is balanced across machines.

We formulate the problem of load balancing tasks with overlapping requirements, where given an assignment of tasks to machines, the load of a machine...

##### Publication details
Date: 1 November 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-151

In this work, we consider the following random process, motivated by the analysis of lock-free concurrent algorithms under stochastic schedulers. In each step, a new ball is allocated into one of $n$ bins, according to a distribution $\vect{p} = (p_1, p_2, \ldots, p_n)$, where each ball goes to bin $i$ with probability $p_i$. When some bin first reaches a set threshold of balls, it registers a \emph{win}, and resets its ball count to $1$. At the same time, bins whose ball count was close to the...

##### Publication details
Date: 1 November 2014
Type: Technical report
Number: MSR-TR-2014-153

In this paper we show that the alpha-beta algorithm and its successor MT-SSS*, as two classic minimax search algorithms, can be implemented as rollout algorithms, a generic algorithmic paradigm widely used in many domains. Specifically, we define a family of rollout algorithms, in which the rollout policy is restricted to select successor nodes only from a certain subset of the children list. We show that any rollout policy in this family (either deterministic or randomized) guarantees...

##### Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: AAAI - Association for the Advancement of Artificial Intelligence

Closeness centrality, first considered by Bavelas (1948), is an importance measure of a node in a network which is based on the distances from the node to all other nodes. The classic definition, proposed by Bavelas (1950), Beauchamp (1965), and Sabidussi (1966), is (the inverse of) the average distance to all other nodes.

We propose the first highly scalable (near linear-time processing and linear space overhead) algorithm for estimating, within a small relative error, the classic closeness...

##### Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery

Social networks serve as important platforms for users to express, exchange and form opinions on various topics. Several opinion dynamics models have been proposed to characterize how a user iteratively updates her expressed opinion based on her innate opinion and the opinion of her neighbors. The extent to how much a user is influenced by her neighboring opinions, as opposed to her own innate opinion, is governed by a measure of her “conformity’ parameter. Characterizing this degree of conformity for...

##### Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: Proc. Intl. Conference on Social Networks (COSN)

Most discovery systems for silent failures work in two phases: a continuous monitoring phase that detects presence of failures through probe packets and a localization phase that pinpoints the faultyelement(s). We focus on the monitoring phase, where the goal is to balance the probing overhead with the cost associated with longer failure detection times.

We formulate a general model for the underlying fundamental subset-test scheduling problem. We unify the treatment of schedulers and cost...

##### Publication details
Date: 1 October 2014
Type: Technical report
Publisher: Elsevier
Number: MSR-TR-2014-113

We consider the problem of consensus in the challenging classic model. In this model, the adversary is adaptive; it can choose which processors crash at any point during the course of the algorithm. Further, communication is via asynchronous message passing: there is no known upper bound on the time to send a message from one processor to another, and all messages and coin flips are seen by the adversary.

We describe a new randomized consensus protocol for this model with expected message...

##### Publication details
Date: 1 October 2014
Type: Technical report
Number: MSR-TR-2014-15

A discussion group is a repeated, synchronized conversation organized around a specific topic. Groups are extremely valuable to the attendees, creating a sense of community among like-minded users. While groups may involve many users, there are many outside the group that would benefit from participation. However, finding the right group is not easy given their quantity and given topic overlap. We study the following problem: given a search query, find a good ranking of discussion groups. We...

##### Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
##### Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: European Association for Theoretical Computer Science
1–25 of 294
Sort
Show 25 | 50 | 100
1234567Next
> Our research