Microsoft Research PhD Scholarship: notification of awards
- Anton Igorevich Kan, University of Cambridge
A Simple System for Morphogenetic Engineering
- Aleš Bizjak, Aarhus Universitet
Relational Reasoning for Programs using Higher-Order Store
- Alexander Fokas, University of Cambridge
Investigation of a radically new energy technology based upon programmable Artificial Photosynthesis
- Andrea Giugliano, University of Leicester
Future filesystems: mechanized specification, validation, implementation and verification of filesystems
- David Swasey, Max-Planck Institute (Software Systems)
Compositional Verification of Scalable Joins by Protocol-Based Refinement
- Georgios Papamakarios, University of Edinburgh
Solving the problem of cascading costs: better approximate Bayesian inference for data pipelines
- Kapil Dev, Lancaster University
Intuitive and Efficient Design of Enclosures for .NET Gadgeteer
- Kareem Khazem, University College London
Weakness as a virtue
- Marco Fraccaro, Technical University of Denmark
Learning to Index
- Martin Szymczak, University of Edinburgh
Bayesian Probabilistic Programming for Security
- Mattia Cinelli, University College London
Experimental and computational studies on the human antigen-specific T cell repertoire
- Moos Hueting, University College London
3D Reconstruction of Live Scenes Using Multiple Heterogeneous Mobile Depth Cameras
- Nooshin Sadat Mirzadeh, École Polytechnique Fédérale de Lausanne (ÉPFL)
3D Smart Memory for Scale-Out Servers
- Omer Lev, Hebrew University
All-Pay Auctions in the Real World
- Paul-Jules Micolet, University of Edinburgh
Performance Portability for Large-Scale Heterogeneous Systems
- Peter Backeman, Uppsala University
First-Order Satisfiability Modulo Theories
- Petko Georgiev, University of Cambridge
Exploiting Mobile Sensing and Geo-social information in Mobile Recommendation Systems
- Stanley Strawbridge, University of Cambridge
Understanding the Dynamics of Embryonic Stem Cells Differentiation: A Combined Experimental and Modeling
- Tim Rocktaschel, University College London
Probabilistic Databases of Multimodal and Universal Schema
- Thomas Reitmaier, University of Cape Town
Natural User Interfaces for the Developing World
- Weili Fu, University of Edinburgh
Provenance for configuration language security
- Yani Ioannou, University of Bath
Expert Visual Classification with Thousands of Categories
Anton Igorevich Kan
University of Cambridge
Research title: A Simple System for Morphogenetic Engineering
Supervisor: Dr Jim Hasseloff, University of Cambridge
Microsoft Research supervisor: Andrew Phillips
Research summary: Biological self-organisation is a complex and emergent process, whereby many interacting cells robustly grow and differentiate into a whole organism. This project aims to engineer a simple system capable of programmable morphogenesis. By using a simplified system, we have a high level of understanding and control over the cells themselves and their interactions.
This project will focus on altering the morphology of surface growing Escherichia coli colonies, since they are relatively well understood and easy to manipulate. In order to change the morphology of the colony in vivo, the genetic makeup of the bacteria will be manipulated in order to regulate cellular shape, adhesion and growth rates. The altered biophysical interactions between cells will in turn alter colony morphology, and such colonies shall be imaged with high-resolution microscopy and characterised through the use of image analysis to quantify the interactions within colonies.
Computational models of the bacterial colonies, that take into account the physical and genetic aspects of cells, are used to further understand the details of the mechanisms involved that are not amenable to experimentation. Through the modelling of the development of particular morphologies, the models can also serve as tools to design biological architecture.
Aarhus Universitet, Denmark
Research title: Relational Reasoning for Programs using Higher-Order Store
Supervisor: Prof Lars Birkedal, Aarhus Universitet
Microsoft Research supervisors: Nick Benton and Andrew Kennedy
Research summary: The research project is concerned with developing models and reasoning techniques for programming languages with a combination of sophisticated features, such as higher-order functions, impredicative polymorphism, and general references. Models of such languages allow us to precisely state and prove properties of programs and provide criteria for judging, and methods for proving, correctness of language implementations, i.e. compilers and interpreters.
University of Cambridge, UK
Research title: Investigation of a Radically New Energy Technology Based Upon Programmable Artificial Photosynthesis
Supervisor: Alex Chin, University of Cambridge
Microsoft Research supervisor: Stephen Emmott
Research summary: There is an urgent need to find alternative (non-carbon-based) sources of energy to both meet a threefold increase in energy demand this century, and to mitigate climate change effects. As a consequence, the concept of Artificial Photosynthesis is rapidly gaining momentum; the prevailing research focus being on the development of solid-state materials able to split water and extract the resulting hydrogen as energy. We propose a radically new, conceptually, form of Artificial Photosynthesis based on photosynthesis as programmable computation, implementable through Living Software on a ‘metabolic material’, to harvest light and convert it to directly usable energy. The project is highly innovative, and is the first ever of its kind, focused at the intersection of twenty-first century computing (programming biology) and a key twenty-first century societal challenge—solving the energy crisis.
University of Leicester, UK
Research title: Future File Systems: Mechanized Specification, Validation, Implementation and Verification of File Systems
Supervisor: Tom Ridge, University of Leicester
Microsoft Research supervisor: Andrew Kennedy
Research summary: We aim to produce a formal specification of file system behaviour, and a verified file system implementation. The specification will be validated against existing real-world implementations, to ensure that it is reasonable. The implementation will be verified correct according to the specification, and the proof mechanized in the HOL4 theorem prover. This would result in a file system implementation with exceptionally strong reliability guarantees.
The specification and implementation have many applications. For example, the specification could be used as a foundation for the further verification of applications that use a file system, such as databases and persistent message queues. The implementation could be used as a key component in systems with high reliability requirements, such as NASA space exploration missions. A verified file system would also be a key component of a fully-fledged verified operating system, such as those currently being developed at Microsoft and elsewhere.
Max-Planck Institute (Software Systems), Germany
Research title: Compositional Verification of Scalable Joins by Protocol-Based Refinement
Supervisor: Derek Dreyer, Max Planck Institute for Software Systems (MPI-SWS)
Microsoft Research supervisor: Nick Benton
Research summary: Fournet and Gonthier's "join calculus" supports a powerful combination of message-passing concurrency and declarative atomicity, thus enabling the convenient encoding of higher-level synchronization abstractions. Turon and Russo have recently developed an efficient implementation of the C# library for joins that scales well to multicore architectures. In this project, we aim to verify a realistic software stack for scalable joins, including both Turon and Russo's implementation and a representative suite of joins-based client code that depends on it. Our hypothesis is that a feasible and effective way to achieve this goal is by using contextual refinement to isolate the verification of the implementation from the verification of the client code. Toward this end, we propose to use recent advances in the technology of "Kripke logical relations" in order to formalize the "protocols" that govern the use of private state in both implementation and client code.
University of Edinburgh, UK
Research title: Solving the Problem of Cascading Costs: Better Approximate Bayesian Inference for Data Pipelines
Supervisor: Iain Murray
Microsoft Research supervisor: John Winn
Research summary: Bayesian statistical methods give state-of-the-art performance in many machine learning applications. However, full joint Bayesian inference is rarely used in large-scale applications with several stages of data processing. Early stages of a pipeline are often seen as "pre-processing", processes outside of a statistical model, and model criticism is usually also a separate manual stage. We aim to provide tools that allow more steps of processing to be routinely part of probabilistic analyses. We hypothesise that new Monte Carlo fitting methods will give practical tools to achieve this aim, resulting in more accurate and trustworthy results for a variety of data-processing tasks involving pipelines. Successful proof-of-concepts will be developed into general tools, where possible as extensions of Microsoft Research’s Infer.NET project.
Lancaster University, UK
Research title: Intuitive and Efficient Design of Enclosures for .NET Gadgeteer
Supervisor: Manfred Lau, University of Lancaster
Microsoft Research supervisor: Nicolas Villar
Research summary: The Gadgeteer framework provides a platform for designing and fabricating electronic devices with a set of small electronic components. The current approach for modelling an enclosure (or physical case) for such a device is with Solidworks, and this process typically takes hours. We propose to develop a set of easy-to-use 3D modelling and digital fabrication tools that novices can use to design an enclosure in the order of minutes. We propose to explore various methods to build these tools, including: a 3D hand gestures and augmented reality interface, a 3D component-centric interface, and grammar-based representations and algorithms. We will take a user-oriented approach to evaluate our interactive tools.
University College London, UK
Research title: Weakness as a Virtue
Supervisor: Jade Alglave
Microsoft Research supervisor: Byron Cook
Research summary: The number of interleavings that a concurrent program can have is typically identified as the root cause of the difficulty of automatic analysis of concurrent software. Weak memory, as implemented by modern multiprocessors such as Intel x86, IBM Power and ARM, is generally believed to make this problem even harder. On the contrary, I believe that weakness can be a virtue.
The hypothesis of this proposal is that by embracing rather than fleeing from weak memory, we can obtain efficient verification techniques. We will experiment with this hypothesis by rejecting total orders when modelling the executions of concurrent programs. Following the partial order semantics tradition, we will model executions with partial orders. I believe that these models have a great potential for practical verification, which has not been fully realised yet.
Technical University of Denmark
Research title: Learning to Index
Supervisor: Ole Winther, Technical University of Denmark
Microsoft Research supervisor: Ulrich Paquet
Research summary: Algorithmic scalability is important for “Big Data” Machine Learning. However, an oft-neglected axis is that of the scalability of decision-making. For matrix factorization-based recommender systems, it is a linear in the number of objects. Under a strict time budget with millions of objects, sublinear retrieval algorithms are required for true scalability. Fast tree structures exist for O(log n) retrieval when the triangle inequality holds between objects, or when the objects are inherently indexable. However, current algorithms are not principally designed to operate within a sublinear retrieval criterion.
Models can be specified such that it is forced to learn a good index over the data: This proposal investigates how a “matrix factorization” recommender can learn constrained parameters that are amenable to O(log n) indexing. This setting of “learning to index” is of cardinal importance to large-scale online retrieval, which is hampered by the lack of indexable solutions.
University of Edinburgh
Research title: Bayesian Probabilistic Programming for Security
Supervisor: David Aspinall
Microsoft Research supervisor: Andy Gordon
Research summary: Our hypothesis is that a probabilistic programming language will enable us to formulate computer security problems as Bayesian inference problems and to solve them using state-of-the-art inference algorithms. We want to investigate this with a PhD project using the language Fun invented at Microsoft Research. The project will code existing and new security analyses based on a range of Bayesian methods, such as those used from evaluation/attack of systems, estimating trustworthiness of network participants, measuring information hiding, anomaly detection, and forensics.
University College London
Research title: Experimental and Computational Studies on the Human Antigen-Specific T Cell Repertoire
Supervisor: Benny Chain
Microsoft Research supervisor: Neil Dalchau
Research summary: The adaptive immune response is defined by clonal expansion of individual lymphocytes in response to antigen. However, the number of different clones responding to any given antigen, and the relationship between clonal frequency and antigen specificity and affinity remain unknown. In this project, we use high throughput sequencing to determine the repertoire of T cells responding to specific antigens. We will develop machine learning algorithms which will be used to classify these T cell receptors, and distinguish between antigen specific and antigen non-specific receptors. The project will also work with Microsoft Research to integrate the frequency data obtained with deterministic models of T cell activation, thus contributing to the overall aim of building a multi-scale model of the evolution of an antigen specific immune response.
University College London
Research title: 3D Reconstruction of Live Scenes Using Multiple Heterogeneous Mobile Depth Cameras
Supervisor: Jan Kautz
Microsoft Research supervisor: Shahram Izadi
Research summary: Imagine an event like a concert, a play or even a formal ball, that is taking place at a distant location. Given its location, you cannot attend in person but you would still like to view and explore the event as if you were attending. Thus, you should be able to roam around freely and explore the event live from any viewpoint. Recent advances in camera technology, in particular depth-sensing (Kinect) and stereo-capable cameras (e.g., some mobile phones) from which depth can be derived, can aid in this context. We propose to use a heterogeneous network of Kinect cameras and stereo-capable mobile devices, where the Kinect cameras would be statically mounted and the 3D mobile devices would be worn / carried by attendees of the event. This proposed setup allows us to collect rich colour and depth data from many viewpoints. Given this input, we believe it possible to reconstruct enough information to facilitate inspecting the scene from any viewpoint at any time instance.
Nooshin Sadat Mirzadeh
École Polytechnique Fédérale de Lausanne (ÉPFL)
Research title: 3D Smart Memory for Scale-Out Servers
Supervisor: Babak Falsafi, EPFL
Microsoft Research supervisor: Dushyanth Narayanan
Research summary: With the end of Dennard scaling on the horizon and the emergence of Big Data and the growing demands for processing, communication and storing massive data, 3D-stacked IC promises to be viable approach to scalability in servers. While 3D-stacked RAM is already available as a product, there a number of fundamental challenges related to power and thermal density, precluding the use of stacking of server grade logic dies on top of DRAM. Moreover, while the recently proposed scale-out processor architectures based on efficient mobile cores dramatically improve performance density and throughput in servers (e.g., by 10x), they do not alleviate the power density and thermal problems in stacking as compared to conventional server dies because they operate at a commensurate power budget. In this proposal, we will investigate logic die organizations using extremely low-power cores that would readily be stackable on existing DRAM stacks with available thermal and power density budgets.
Research title: All-Pay Auctions in the Real World
Supervisor: Jeffrey Rosenschein, Hebrew University
Microsoft Research supervisor: Yoram Bachrach
Research summary: In the past decade, the field of algorithmic game theory (AGT) has grown considerably, both with respect to advances in research, as well as a significant increase in its real-world application. The use of AGT includes the utilization of game-theoretic concepts (such as maximizing auction revenue and making voting less easily manipulated) in various Internet scenarios. One of the major areas that has developed significantly is the involvement of a large number of individuals in working, as freelancers, on specific problems or tasks, i.e., crowdsourcing. Interest in crowdsourcing has rekindled research into all-pay auctions, which can help in modeling these interactions. We propose exploring various novel aspects of all-pay auctions, such as the possibility of mergers and collusions, and the use of all-pay auctions in real-world mechanisms.
University of Edinburgh
Research title: Performance Portability for Large-Scale Heterogeneous Systems
Supervisor: Christophe Dubach
Microsoft Research supervisor: Ant Rowstron
Research summary: Computing systems have become increasingly complex and hard to program. This is especially true for data centres since these systems have become more heterogeneous with the recent availability of GPUs. As a result, achieving high performance for such large-scale systems is an extremely challenging task. This problem is further exacerbated with each new hardware generation, which means software written and tuned for today's systems will need to be adapted frequently to keep pace with ever changing hardware.
This proposal is an attempt to tackle the inherent complexity of computing systems by: (1) abstracting away the details of these systems using a hierarchical hardware description, (2) providing a higher-level programming model that does not expose any specific hardware features, and (3) automatically tuning the software to the hardware using statistical techniques. The key idea is that software should be written only once and automatically tune itself for the underlying system.
Research title: First-Order Satisfiability Modulo Theories
Supervisor: Philipp Ruemmer, Uppsala University
Microsoft Research supervisor: Christophe Wintersteiger
Research summary: SAT and SMT solvers form the backbone of many of today's verification systems, responsible for discharging verification conditions that encode correctness properties of hardware or software designs. SMT solvers are essentially ground decision procedures, in the sense that quantifier treatment is implemented by means of instantiation on top of a procedure for quantifier-free reasoning. For verification and other applications, efficient handling of quantifiers has been identified as one of the main challenges in the development of solvers.
This proposal is about lifting the SMT paradigm to the first-order level, by including quantifiers as first-class citizens in solvers.
University of Cambridge
Research title: Exploiting Mobile Sensing and Geo-social information in Mobile Recommendation Systems
Supervisor: Cecilia Mascolo, University of Cambridge
Microsoft Research supervisor: Natasa Milic-Frayling
Research summary: In this proposal we extend location based recommendation by bridging the gap between mobile sensing, location- based and on-line systems. We aim to address the problem of collecting user contextual information in real time from sensor equipped smartphones and integrate that with social and geographical information, also gathered through the interaction of phones with the mobile web, in order to offer optimal content on-line and geographic suggestions to users. The technical challenges of providing high quality recommendations are:
- The ability to accurately detect the user context in presence of energy limitations for phone sensing
- The integration of the most effective spatio-temporal data mining features in an environment of sparse, very large datasets
- The urban sensing of events in the city by observing large streams of online data for the detection of ongoing events
- The offering of online recommendations using real time back end computation of current user input
University of Cambridge
Research title: Understanding the Dynamics of Embryonic Stem Cells Differentiation: A Combined Experimental and Modeling
Supervisor: Graziano Martello, University of Cambridge
Microsoft Research supervisor: Hillel Kugler
Research summary: Pluripotency is a key feature of embryonic stem (ES) cells, and is defined as the ability to form all cell lineages present in the adult body (Smith, 2006). Pluripotent ES cells are thus potentially a great biomedical resource and, at the same time, a unique and intellectually fascinating cell type. For these reasons ES cells have been extensively investigated in the past three decades, leading to a comprehensive understanding of the signals maintaining pluripotency and of the transcription factors that constitute the gene regulatory network (GRN) of ES cells. Surprisingly, very little is known about how ES cells exit the pluripotent state and start to differentiate. Our aim is to understand the molecular events associated with this transition and its dynamics.
University College London
Probabilistic Databases of Multimodal and Universal Schema
Supervisor: Sebastian Riedel
Microsoft Research supervisor: Thore Graepel
Research summary: Databases can readily return answers explicitly stored in their tables, but it is difficult to query the implicit information captured in the database. To overcome this problem, we propose to develop probabilistic databases with relations corresponding to heterogeneous and multimodal content (number of Starbucks per km2), and relations corresponding to natural language surface patterns (“is gentrified”) appearing in text. The databases will automatically learn correlations between the multimodal and the surface pattern relations, and hence induce (some notion of) the meaning of surface patterns, grounded both with respect to other surface patterns, and to the databases’ multi-modal content. Querying implicit content then amounts to using the surface pattern relations in a query. Our approach to realize such databases will be based on latent feature models and matrix completion methods, and will require innovation in the intersection of IE, Semantics, Machine Learning and Databases.
University of Cape Town
Research title: Natural User Interfaces for the Developing World
Supervisor: Prof Edwin Blake
Microsoft Research supervisor: Richard Harper
Research summary: The goal of this research is to investigate the design and development of interfaces that are ‘natural’ for those living in the developing world.
Natural user interfaces (NUI) hold out the promise of revolutionising the experience and use of computers. To date, the bulk of research in this area has explored natural interfaces for the office and home of the twenty-first century in the developed world. Much less research attention has been given to creating such interfaces for the developing world. This is because innovation in this setting requires a much broader approach to create something that is both practical to design and build, and which deals with the particular constraints of the developing world.
University of Edinburgh
Research title: Provenance for Configuration Language Security
Supervisor: James Cheney
Microsoft Research supervisor: Dimitrios Vytiniotis
Research summary: Declarative, high-level configuration languages (e.g., LCFG, Puppet, Chef) are widely used in industry to configure large system installations. Configurations are often composed from distributed source files managed by many different users within different system and organisational boundaries. Users may make changes whose consequences are not easy to understand, and such systems also currently lack mature security access controls; the few currently available techniques have idiosyncratic behaviour and offer no formal guarantees. In the worst case, misconfiguration can lead to costly system failures; because of the complexity of the configuration build, it is difficult to recover from failures, trace the source of the error or identify the responsible party. In this project, we will explore the application of provenance techniques (originally developed in the context of databases) to establishing well-founded and effective techniques for security and audit for configuration languages.
University of Bath
Research title: Expert Visual Classification with Thousands of Categories
Supervisor: Matthew Brown
Microsoft Research supervisor: Antonio Criminisi
Research summary: Most work in visual classification is aimed at emulating, rather than augmenting, human recognition performance. This is partially attributable to the current paradigm in object recognition, which involves non-expert human labelling of a large number of training images, followed by supervised classification. This approach works well for small numbers of visually sparse classes, but fails to replicate the abilities of experts in more technical domains, such as medical image understanding. Furthermore, state-of-the-art techniques fail to scale effectively to large numbers of visual categories, which are commonplace in medical or biological taxonomies. In this project we will develop general techniques for scalable, expert visual classifiers that augment the capabilities of a typical person, aiming to furnish them with the visual insight of a botanist, architect or doctor.