MWT is a toolbox of machine learning technology for principled and efficient experimentation, plausibly applicable to most Microsoft services that interact with customers.
The goal of the project is to advance our knowledge about a fundamental distributed decision-making problem that is referred by different names, such as "consensus", "majority", "quantized averaging", and is closely related to the voter model, quantile computation, selection problem, local interaction, and coordination games. The consensus problem is of interest in different applications including general distribution computation, population protocols, biological systems, and social networks.
This is a project looking into design and evaluation of efficient and deployable algorithms for assignment of complex workloads to resources in modern cloud service platforms.
We present Sparse Reflections Analysis (SRA), an algorithm for removing multipath interference from Time of Flight sensors. SRA allows for very general forms of multipath, including interference with three or more paths, diffuse multipath resulting from Lambertian surfaces, and combinations thereof. SRA removes this general multipath with robust techniques based on L1 optimization. Further, due to a novel dimension reduction, we present a very fast version of SRA, which can run at frame rate.
Labs: ATL Israel
This project aims to understand and characterize user behavior in online social networks. Specifically, we posit, analyze, and validate different models of formation and evolution of opinions in social networks. We characterize our models both in terms of stability and goodness in terms of an approximation ratio w.r.t. to a social optimum (e.g., Price of Anarchy, Price of Stability etc).
We publish a subset of the data from the paper "Discriminative Ferns Ensemble for Hand Pose Recognition".
We work on questions motivated by machine learning, in particular from the theoretical and computational perspectives. Our goals are to mathematically understand the effectiveness of existing learning algorithms and to design new learning algorithms. We combine expertise from diverse fields such as algorithms and complexity, statistics, and convex geometry.
MSR has a strong group of scientists working on algorithm design, analysis, and experimental evaluation, as well as researchers in related areas (e.g., coding theory), but no formal algorithms group. The Virtual Algorithms Center (VIRAL) brings these individuals together. The goals of the center is to enhance collaboration between algorithms researchers and the rest of MSR, provide internal consulting, and give an external view of the algorithms research at MSR.
Tabular is a Excel add-in that brings the power of model based machine learning to data enthusiasts. It allows the user to write a simple model that explains their data and perform Bayesian inference. Tabular is built on top of Infer.NET.
Face In The Crowd examines the social impact of crowdsourcing platforms—cloud-based computational systems that allow the outsourcing of work through open requests—and how they might shape the future of work.
Labs: New England
This is an umbrella project for our activity in machine learning with exploration-exploitation tradeoff. Most of us are at MSR-NYC.
We are working toward a theoretical foundation of developing large-scale human-machine systems that combine the intelligence of human and the computing power of machine to solve the problems that are difficult to solve by either human or machine alone.
This project focuses on advancing the state-of-the-art in language processing with recurrent neural networks. We are currently applying these to language modeling, machine translation, speech recognition, language understanding and meaning representation. A special interest in is adding side-channels of information as input, to model phenomena which are not easily handled in other frameworks.
It has become increasingly difficult to process information from the vast amount of data collected and stored in different computer systems all around us. The main challenge is how to design distributed and scalable algorithms and systems to extract useful information from this large quantity of data. The goal of this project is to address various questions related to distributed and scalable processing of big data.
The performance of the elliptic curve method (ECM) for integer factorizationplays an important role in the security assessment of RSA-based protocols as a cofactorization tool inside the number field sieve. This webpage gives addition-subtracting chains to optimize Edwards ECM in terms of both performance and memory requirements. See for more details the "ECM at Work" paper.
Rainbow Tangle is a game on a sequence of permutations. The purpose of the game is to create as straight routes of the permutation particles as possible.
We work on novel sampling techniques to design faster algorithms for solving problems on large data. Algorithmic applications of our methods include dimensionality reduction, clustering, and linear algebraic problems on large matrices.
In the last 25 years, Elliptic Curve Cryptography (ECC) has become a mainstream primitive for cryptographic protocols and applications. ECC has been standardized for use in key exchange and digital signatures. This project focuses on efficient generation of parameters and implementation of ECC and pairing-based crypto primitives, across architectures and platforms.
LIQUi|> (pronounced "liquid") is a software architecture and toolsuite for quantum computing. It is currently being developed and will include programming languages, optimization and scheduling algorithms, and quantum simulators. Ultimately, LIQUi|> will be used to translate a quantum algorithm written in the form of a high-level program into the low-level machine instructions for a quantum device. LIQUi|> is being developed by the Quantum Architectures and Computation Group (QuArC).
We introduced a game-theoretic framework for crowdsourcing systems.
Counterdog is an automated theorem prover for a counterfactual meta-logic on propositional Datalog. The prover is complete for the logic, and can prove (or disprove) counterfactual statements such as: "if 'p' is false in a Datalog program, but would be true if it contained 'a:-b', then 'b' is true in the program.". Counterdog is useful for reasoning about Datalog-based trust management languages. The theory and implementation were developed by Moritz Y. Becker and Nik Sultana.
In recent years, computational challenges have become more and more important to infer biologically relevant information from the vast amount of experimental data available to systems biologists.
Labs: New England
What is the link between how problems are logically described in a formal language, and how they are solved by algorithms?
Time Series Foundation (TSF) is a .NET toolset for exploring new algorithms in time series analysis and forecasting. TSF is based on state space model methodology that includes all types of exponential smoothing, some autoregressive algorithms, and innovative algorithms for event detection and calendar event impact prediction. TSF implements an Excel interop layer and offers an Excel add-in that exposes a large subset of TSF functionality through the Excel ribbon UI.
Automata is a .NET tool kit that provides facilities for manipulating and analyzing regular expressions, symbolic finite automata, and symbolic finite transducers. It supports automata and transducers where input and output alphabets can be fully symbolic. Constraints over the alphabets can be analyzed using Satisfiability Modulo Theory (SMT) solvers. The tool kit provides a particular extension that uses the Microsoft SMT solver Z3.