Events (457)

Groups (153)
News (2791)

People (722)

Projects (1120)
Publications (12756)
Videos (5899)
##### Research areas
Algorithms and theory (355)
Communication and collaboration (219)
Computational linguistics (251)
Computational sciences (235)
Computer systems and networking (801)
Computer vision (920)
Data mining and data management (136)
Economics and computation (107)
Education (89)
Gaming (81)
Graphics and multimedia (239)
Hardware and devices (220)
Health and well-being (93)
Human-computer interaction (931)
Machine learning and intelligence (955)
Mobile computing (73)
Quantum computing (37)
Search, information retrieval, and knowledge management (707)
Security and privacy (327)
Social media (64)
Social sciences (273)
Software development, programming principles, tools, and languages (638)
Speech recognition, synthesis, and dialog systems (152)
Technology for emerging markets (58)
##### Publication details
Date: 1 January 2016
Type: Article

In many applications, the structure of data can be represented by a hyper-graph, where the data items are vertices, and the associations among items are represented by hyper-edges. Equivalently, we are given as input a bipartite graph with two kinds of vertices: items, and associations (which we refer to as topics). We consider the problem of partitioning the set of items into a given number of partitions, such that the maximum number of topics covered by a partition is minimized.

This is a...

##### Publication details
Date: 1 December 2015
Type: Inproceeding
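
The min-max objective in the hypergraph partitioning abstract above can be made concrete with a small sketch. The greedy rule below is a hypothetical baseline chosen for illustration, not the method of the paper, and the names `max_topics_covered` and `greedy_partition` are assumptions. It ignores any balance constraint on partition sizes.

```python
def max_topics_covered(assignment, item_topics, k):
    """Return the largest number of distinct topics covered by any of the k partitions."""
    covered = [set() for _ in range(k)]
    for item, part in assignment.items():
        covered[part] |= item_topics[item]
    return max(len(s) for s in covered)


def greedy_partition(item_topics, k):
    """Hypothetical baseline: place each item where it adds the fewest new topics."""
    covered = [set() for _ in range(k)]
    assignment = {}
    for item, topics in item_topics.items():
        # Prefer the partition gaining the fewest new topics;
        # break ties by the current number of covered topics.
        best = min(range(k), key=lambda p: (len(topics - covered[p]), len(covered[p])))
        covered[best] |= topics
        assignment[item] = best
    return assignment


# Items map to the topics (hyper-edges) they are associated with.
item_topics = {"a": {1, 2}, "b": {1, 2}, "c": {3, 4}, "d": {3, 4}}
assignment = greedy_partition(item_topics, k=2)
objective = max_topics_covered(assignment, item_topics, k=2)
```

Items that share topics end up together here, so each partition covers only two of the four topics.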
##### Publication details
Date: 1 November 2015
Type: Article
Publisher: Nature Publishing Group

We consider the problem of minimizing the sum of two convex functions: one is smooth and given by a gradient oracle, and the other is separable over blocks of coordinates and has a simple known structure over each block. We develop an accelerated randomized proximal coordinate gradient (APCG) method for minimizing such convex composite functions. For strongly convex functions, our method achieves faster linear convergence rates than existing randomized proximal coordinate gradient methods. Without...

##### Publication details
Date: 1 November 2015
Type: Article
Publisher: SIAM – Society for Industrial and Applied Mathematics
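
To illustrate the problem class in the APCG abstract above, here is a minimal sketch of a plain (unaccelerated) randomized proximal coordinate gradient step. It is not the APCG method itself. As assumptions for illustration, the smooth part is a least-squares term, the separable part is an L1 penalty, and every coordinate is its own block; all function names are hypothetical.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * |x| for a single coordinate."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def prox_coordinate_descent(A, b, lam, iters=500, seed=0):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1, one random coordinate at a time.

    Assumes every column of A is nonzero.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    x = [0.0] * n
    residual = [-bi for bi in b]  # residual = A x - b, with x = 0 initially
    # Coordinate-wise Lipschitz constants: squared column norms of A.
    lipschitz = [sum(A[r][j] ** 2 for r in range(m)) for j in range(n)]
    for _ in range(iters):
        j = rng.randrange(n)
        # Partial derivative of the smooth part with respect to x[j].
        grad_j = sum(A[r][j] * residual[r] for r in range(m))
        new_xj = soft_threshold(x[j] - grad_j / lipschitz[j], lam / lipschitz[j])
        delta = new_xj - x[j]
        for r in range(m):
            residual[r] += A[r][j] * delta  # keep the residual in sync with x
        x[j] = new_xj
    return x


# With A equal to the identity, the minimizer is the soft-thresholded b.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [3.0, 0.5]
x = prox_coordinate_descent(A, b, lam=1.0)
```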

In modern web-scale applications that collect data from different sources, entity conflation is a challenging task due to various data quality issues. In this paper, we propose a robust and distributed framework to perform conflation on noisy data in the Microsoft Academic Service dataset. Our framework contains two major components. In the offline component, we train a GBDT model to determine whether two papers from different sources should be conflated to the same paper entity. In the online...

##### Publication details
Date: 1 October 2015
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers

We propose an approach for approximating the Jaccard similarity of two streams, for domains where this similarity is known to be high. Our method is based on a reduction from Jaccard similarity to F_2 norm estimation, for which there exists a sketch that is efficient in terms of both size and compute time, which we augment by a sampling technique. Our approach offers an improvement in the fingerprint size that is quadratic in the degree of similarity between the streams. Further, computing our...

##### Publication details
Date: 1 October 2015
Type: Article
Publisher: Elsevier
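
The reduction from Jaccard similarity to F_2 estimation mentioned in the streaming abstract above rests on a standard identity: for the 0/1 indicator vectors of two sets, the squared L2 distance equals the size of their symmetric difference. The sketch below computes that quantity exactly rather than with a sketch or sampling, so it only illustrates the identity; the paper's actual construction may differ, and `jaccard_from_f2` is a hypothetical name.

```python
def jaccard_from_f2(a, b):
    """Jaccard similarity from set sizes and the squared L2 distance of indicator vectors."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # convention chosen here for two empty sets
    # Squared L2 distance of the indicator vectors = size of the symmetric difference.
    f2 = sum((int(e in a) - int(e in b)) ** 2 for e in a | b)
    # |A ∩ B| = (total - f2) / 2 and |A ∪ B| = (total + f2) / 2.
    return (total - f2) / (total + f2)


a = {1, 2, 3, 4}
b = {3, 4, 5}
similarity = jaccard_from_f2(a, b)
direct = len(a & b) / len(a | b)
```

When the two sets are highly similar, the F_2 term is small relative to the set sizes, which is the regime the abstract targets.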

The advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles and animals. Many techniques have been proposed for processing, managing and mining trajectory data in the past decade, fostering a broad range of applications. In this article, we conduct a systematic survey on the major research into trajectory data mining, providing a panorama of the field...

##### Publication details
Date: 1 September 2015
Type: Article
Publisher: ACM – Association for Computing Machinery

High memory contention is generally agreed to be a worst-case scenario for concurrent data structures. There has been a significant amount of research effort spent investigating designs which minimize contention, and several programming techniques have been proposed to mitigate its effects. However, there are currently few architectural mechanisms to allow scaling contended data structures at high thread counts.

In this paper, we investigate hardware support for scalable contended data...

##### Publication details
Date: 1 September 2015
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2015-71

Mobile crowdsourcing is a powerful tool for collecting data of various types. The primary bottleneck in such systems is the high burden placed on the user who must manually collect sensor data or respond in-situ to simple queries (e.g., experience sampling studies). In this work, we present Compressive CrowdSensing (CCS) – a framework that enables compressive sensing techniques to be applied to mobile crowdsourcing scenarios. CCS enables each user to provide significantly reduced amounts of manually...

##### Publication details
Date: 1 September 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery

In natural language understanding (NLU), a user utterance can be labeled differently depending on the domain or application (e.g., weather vs. calendar). Standard domain adaptation techniques are not directly applicable to take advantage of the existing annotations because they assume that the label set is invariant. We propose a solution based on label embeddings induced from canonical correlation analysis (CCA) that reduces the problem to a standard domain adaptation task and allows use of a number of...

##### Publication details
Date: 29 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics

In this paper, we introduce the task of selecting a compact lexicon from large, noisy gazetteers. This scenario arises often in practice, in particular in spoken language understanding (SLU). We propose a simple and effective solution based on matrix decomposition techniques: canonical correlation analysis (CCA) and rank-revealing QR (RRQR) factorization. CCA is first used to derive low-dimensional gazetteer embeddings from domain-specific search logs. Then RRQR is used to find a subset of...

##### Publication details
Date: 27 August 2015
Type: Proceedings
Publisher: ACL – Association for Computational Linguistics

We introduce cost allocation in the cloud: the problem of attributing the costs produced by virtual machine workloads in the data center to each player in the cloud. A player here can be the size of a virtual machine type or the count of virtual machines in the user-decided virtual machine packets. We derive this problem by analyzing the cost of real workloads in Microsoft Azure. To the best of our knowledge, cost allocation in the cloud has never been studied before. Based on the fact that the players in the cloud form...

##### Publication details
Date: 1 August 2015
Type: Inproceeding
Publisher: SoCC 2015 (Poster)

We consider revenue negotiation problems in iterative settings. In our model, a group of agents has some initial resources, which are used to generate revenue. Agents must agree on some way of dividing resources, but there’s a twist. At every time-step, the revenue shares received at time t become the agents' resources at time t + 1, and the game is repeated. The key issue here is that the way resources are shared has a dramatic effect on long-term social welfare, so in order to maximize...

##### Publication details
Date: 1 August 2015
Type: Article

The problem of electing a leader from among n contenders is one of the fundamental questions in distributed computing. In its simplest formulation, the task is as follows: given n processors, all participants must eventually return a win or lose indication, such that a single contender may win. Despite a considerable amount of work on leader election, the following question is still open: can we elect a leader in an asynchronous fault-prone system faster than...

##### Publication details
Date: 1 July 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery

Symbolic finite automata and transducers augment classic automata and transducers with symbolic alphabets represented as parametric theories. This extension enables succinct representation of large and potentially infinite alphabets while preserving closure and decidability properties. Extended symbolic finite automata and transducers further extend these objects by allowing transitions to read consecutive input elements in a single step. In this paper we study the properties of these models. In contrast...

##### Publication details
Date: 1 July 2015
Type: Article
Publisher: Springer
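
As a rough illustration of the symbolic alphabets described in the automata abstract above, the sketch below labels each transition with a predicate over characters instead of a single symbol. It covers only the basic (non-extended) case, assumes a deterministic automaton, and uses plain Python callables as a stand-in for a parametric theory; `run_sfa` is a hypothetical name.

```python
def run_sfa(transitions, start, accepting, word):
    """Run a deterministic symbolic automaton.

    transitions is a list of (source_state, predicate, target_state) triples.
    """
    state = start
    for symbol in word:
        for src, predicate, dst in transitions:
            if src == state and predicate(symbol):
                state = dst
                break
        else:
            return False  # no enabled transition for this symbol: reject
    return state in accepting


# Accepts an optional '-' followed by one or more digits.
# str.isdigit covers every Unicode digit with a single predicate,
# where a classic automaton would need one transition per symbol.
transitions = [
    (0, lambda c: c == "-", 1),
    (0, str.isdigit, 2),
    (1, str.isdigit, 2),
    (2, str.isdigit, 2),
]

accepted = run_sfa(transitions, 0, {2}, "-42")
```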

The establishment of homeostasis among cell growth, differentiation, and apoptosis is of key importance for organogenesis. Stem cells respond to temporally and spatially regulated signals by switching from mitotic proliferation to asymmetric cell division and differentiation. Executable computer models of signaling pathways can accurately reproduce a wide range of biological phenomena by reducing detailed chemical kinetics to a discrete, finite form. Moreover, coordinated cell movements and physical...

##### Publication details
Date: 1 July 2015
Type: Article

In this work, we consider the following random process, motivated by the analysis of lock-free concurrent algorithms under high memory contention. In each round, a new scheduling step is allocated to one of $n$ threads, according to a distribution $\mathbf{p} = (p_1, p_2, \ldots, p_n)$, where thread $i$ is scheduled with probability $p_i$. When some thread first reaches a set threshold of executed steps, it registers a *win*, completing its current operation, and resets its step count to $1$. At...

##### Publication details
Date: 1 July 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery

Population protocols are networks of finite-state agents, interacting randomly, and updating their state using simple rules. Despite their extreme simplicity, these systems have been shown to cooperatively perform complex computational tasks, such as simulating register machines to compute standard arithmetic functions. The election of a unique leader agent is a key requirement in such computational constructions. Yet, the fastest currently known population protocol for electing a leader only...

##### Publication details
Date: 1 July 2015
Type: Inproceeding
Publisher: Springer
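
For the population-protocol leader election abstract above, the following sketch simulates the simple textbook baseline in which every agent starts as a leader and one of two meeting leaders steps down. It is an assumption made for illustration and is not the protocol proposed in the paper; `elect_leader` is a hypothetical name.

```python
import random


def elect_leader(n, seed=0):
    """Simulate random pairwise interactions until a single leader remains."""
    rng = random.Random(seed)
    is_leader = [True] * n  # every agent starts in the leader state
    leaders, interactions = n, 0
    while leaders > 1:
        i, j = rng.sample(range(n), 2)  # two distinct agents meet at random
        interactions += 1
        if is_leader[i] and is_leader[j]:
            is_leader[j] = False  # one of the two leaders becomes a follower
            leaders -= 1
    return is_leader, interactions


state, steps = elect_leader(50)
```

Each agent needs only two states here, but the last few leaders take a long time to meet, which is the kind of slowness faster protocols aim to avoid.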
##### Publication details
Date: 1 July 2015
Type: Proceedings
Publisher: Springer

Population protocols, roughly defined as systems consisting of large numbers of simple identical agents, interacting at random and updating their state following simple rules, are an important research topic at the intersection of distributed computing and biology. One of the fundamental tasks that a population protocol may solve is majority: each node starts in one of two states; the goal is for all nodes to reach a correct consensus on which of the two states was initially the majority....

##### Publication details
Date: 1 July 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery

Modern software applications and services operate nowadays on top of large clusters and datacenters. To reduce the underlying infrastructure cost and increase utilization, different services share the same physical resources (e.g., CPU, bandwidth, I/O, memory). Consequently, the cluster provider often has to decide in real-time how to allocate resources in overbooked systems, taking into account the different characteristics and requirements of users. In this paper, we consider an important problem...

##### Publication details
Date: 1 June 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery

Shpilka and Wigderson [SW99] had posed the problem of proving exponential lower bounds for (nonhomogeneous) depth three arithmetic circuits with bounded bottom fanin over a field $F$ of characteristic zero. We resolve this problem by proving an $N^{\Omega(d/t)}$ lower bound for (nonhomogeneous) depth three arithmetic circuits with bottom fanin at most $t$ computing an explicit $N$-variate polynomial of degree $d$ over $F$.

##### Publication details
Date: 1 June 2015
Type: Inproceeding
Publisher: LIPICS

We study the problem of placing streaming queries into servers. Unlike previous work, we focus on queries that consume events at relatively low rates, each computed in a single server (i.e., no scaling out per query). However, we need to place a very large and dynamic number of queries in relatively few servers. Our focus is motivated by the need to support a platform for hosting end-user streaming queries that may come from a variety of applications, such as the Cortana personal assistant.

The...

##### Publication details
Date: 1 June 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery

We study the journey planning problem in public transit networks. Developing efficient preprocessing-based speedup techniques for this problem has been challenging: current approaches either require massive preprocessing effort or provide limited speedups. Leveraging recent advances in Hub Labeling, the fastest algorithm for road networks, we revisit the well-known time-expanded model for public transit. Exploiting domain-specific properties, we provide simple and efficient algorithms for the earliest...

##### Publication details
Date: 1 June 2015
Type: Inproceeding
Publisher: Springer
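
The time-expanded model mentioned in the public transit abstract above can be sketched as a graph whose nodes are (stop, time) events, joined by riding edges along connections and waiting edges between consecutive events at the same stop. The code below answers an earliest-arrival query with a plain graph search and does not implement Hub Labeling. The tuple layout, the zero transfer time, and the name `earliest_arrival` are assumptions for illustration.

```python
from collections import defaultdict, deque


def earliest_arrival(connections, source, target, start_time):
    """connections: list of (dep_stop, dep_time, arr_stop, arr_time) tuples."""
    events = defaultdict(set)  # stop -> set of event times
    edges = defaultdict(list)  # (stop, time) -> successor events
    for dep_stop, dep_time, arr_stop, arr_time in connections:
        events[dep_stop].add(dep_time)
        events[arr_stop].add(arr_time)
        edges[(dep_stop, dep_time)].append((arr_stop, arr_time))  # riding edge
    for stop, times in events.items():
        ordered = sorted(times)
        for earlier, later in zip(ordered, ordered[1:]):
            edges[(stop, earlier)].append((stop, later))  # waiting edge
    if source == target:
        return start_time
    # Enter the graph at the first event at the source not before start_time.
    candidates = [t for t in events[source] if t >= start_time]
    if not candidates:
        return None
    first = (source, min(candidates))
    seen, queue, best = {first}, deque([first]), None
    while queue:
        stop, time = queue.popleft()
        if stop == target and (best is None or time < best):
            best = time
        for nxt in edges[(stop, time)]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return best


# Times are minutes after midnight; the direct A-to-C trip is slower than changing at B.
connections = [("A", 480, "B", 540), ("B", 600, "C", 660), ("A", 480, "C", 780)]
arrival = earliest_arrival(connections, "A", "C", start_time=420)
```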
##### Publication details
Date: 1 June 2015
Type: Article
Number: 4
