Our research
Software development, programming principles, tools, and languages (564)
1–25 of 564
Emerson Murphy-Hill, Thomas Zimmermann, Christian Bird, and Nachiappan Nagappan

When software engineers fix bugs, they may have several options as to how to fix those bugs. Which fix they choose has many implications, both for practitioners and researchers: What is the risk of introducing other bugs during the fix? Is the bug fix in the same code that caused the bug? Is the change fixing the cause or just covering a symptom? In this paper, we investigate alternative fixes to bugs and present an empirical study of how engineers make design choices about how to fix bugs. We start...

Publication details
Date: 1 December 2015
Type: Article
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Abram Hindle, Christian Bird, Thomas Zimmermann, and Nachiappan Nagappan

Large organizations like Microsoft tend to rely on formal requirements documentation in order to specify and design the software products that they develop. These documents are meant to be tightly coupled with the actual implementation of the features they describe. In this paper we evaluate the value of high-level topic-based requirements traceability and issue report traceability in the version control system, using Latent Dirichlet Allocation (LDA). We evaluate LDA topics on practitioners...

Publication details
Date: 1 December 2015
Type: Article
Publisher: Springer
Gordon Stewart, Mahanth Gowda, Geoffrey Mainland, Bozidar Radunovic, Dimitrios Vytiniotis, and Cristina Luengo Agulló

Software-defined radio (SDR) brings the flexibility of software to wireless protocol design, promising an ideal platform for innovation and rapid protocol deployment. However, implementing modern wireless protocols on existing SDR platforms often requires careful hand-tuning of low-level code, which can undermine the advantages of software.

Ziria is a new domain-specific language (DSL) that offers programming abstractions suitable for wireless physical (PHY) layer tasks while emphasizing the...

Publication details
Date: 1 March 2015
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Robert A Cochran, Loris D’Antoni, Benjamin Livshits, David Molnar, and Margus Veanes

In this paper, we investigate an approach to program synthesis that is based on crowd-sourcing. With the help of crowd-sourcing, we aim to capture the “wisdom of the crowds” to find good if not perfect solutions to inherently tricky programming tasks, which elude even expert developers and lack an easy-to-formalize specification.

We propose an approach we call program boosting, which involves crowd-sourcing imperfect solutions to a difficult programming problem from developers and then...

Publication details
Date: 1 January 2015
Type: Proceedings
Publisher: ACM – Association for Computing Machinery
Thomas Ball and Jakub Daniel

Dynamic symbolic execution (DSE) is a well-known technique for automatically generating tests to achieve higher levels of coverage in a program. Two key ideas of DSE are to: (1) seed symbolic execution by executing a program on an initial input; and (2) use concrete values from the program execution in place of symbolic expressions whenever symbolic reasoning is hard or not desired. We describe DSE for a simple core language and then present a minimalist implementation of DSE for Python (in Python) that...

Publication details
Date: 1 January 2015
Type: Article
Publisher: IOS Press
Menghui Lim, Jian-Guang Lou, Hongyu Zhang, Qiang Fu, Andrew Teoh, Qingwei Lin, Rui Ding, and Dongmei Zhang

For a large-scale software system, especially an online service system, when a performance issue occurs, it is desirable to check whether this issue has occurred before. If there are past similar issues, a known remedy could be applied. Otherwise, a new troubleshooting process may have to be initiated. The symptom of a performance issue can be characterized by a set of metrics. Due to the sophisticated nature of software systems, manual diagnosis of performance issues based on metric data is typically...

Publication details
Date: 14 December 2014
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Tom Crick, Benjamin A. Hall, Samin Ishtiaq, and Kenji Takeda

The reproduction and replication of reported scientific results is a hot topic within the academic community. The retraction of numerous studies from a wide range of disciplines, from climate science to bioscience, has drawn the focus of many commentators, but there exists a wider socio-cultural problem that pervades the scientific community. Sharing data and models often requires extra effort, and this is currently seen as a significant overhead that may not be worth the time investment....

Publication details
Date: 1 December 2014
Type: Inproceeding
Zhenyu Guo, Cheng Chen, Haoxiang Lin, Sean McDirmid, Fan Yang, Xueying Guo, Mao Yang, and Lidong Zhou

Our cloud services are losing too many battles to faults like software bugs, resource interference, and hardware failures. Many tools can help us win these battles: model checkers to verify, fault injection to find bugs, replay to debug, and many more. Unfortunately, tools are currently afterthoughts in cloud service designs that must either be tediously tangled into service implementations or integrated transparently in ways that fail to effectively capture the service’s problematic non-deterministic...

Publication details
Date: 26 November 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-150
Baishakhi Ray, Meiyappan Nagappan, Christian Bird, Nachiappan Nagappan, and Thomas Zimmermann

Changes in software development come in many forms. Some changes are frequent, idiomatic, or repetitive (e.g. adding checks for nulls or logging important values) while others are unique. We hypothesize that unique changes are different from the more common similar (or non-unique) changes in important ways; they may require more expertise or represent code that is more complex or prone to mistakes. As such, these changes are worthy of study. In this paper, we present a definition of unique changes and...

Publication details
Date: 25 November 2014
Type: Technical report
Number: MSR-TR-2014-149
Benjamin Livshits and George Kastrinis

Crowd-sourcing is increasingly being used for providing answers to online polls and surveys. However, existing systems, while taking care of the mechanics of attracting crowd workers, poll building, and payment, generally provide little by way of cost-management (e.g. working with a tight budget), time-management (e.g. obtaining results as quickly as possible), and controlling the margin of error (e.g. working on a sample population which is largely different from the general census statistics). The...

Publication details
Date: 14 November 2014
Type: Technical report
Number: MSR-TR-2014-145
Benjamin Livshits and Todd Mytkowicz

Crowd-sourcing is increasingly being used for large-scale polling and surveys. Companies such as SurveyMonkey and Instant.ly make crowd-sourced surveys commonplace by making the crowd accessible through an easy-to-use UI with easy-to-retrieve results. Further, they do so with a relatively low latency by having dedicated crowds at their disposal. In this paper we argue that the ease with which polls can be created conceals an inherent difficulty: the survey maker does not know how many workers to hire for...

Publication details
Date: 2 November 2014
Type: Inproceeding
Publisher: AAAI – Association for the Advancement of Artificial Intelligence
Kim Herzig

Software quality is one of the most pressing concerns for nearly all software-developing companies. At the same time, software companies also seek to shorten their release cycles to meet market demands while maintaining their product quality. Identifying problematic code areas becomes more and more important. Defect prediction models have become popular in recent years and many different code and process metrics have been studied. There has been minimal effort relating test executions during development with...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: IEEE – Institute of Electrical and Electronics Engineers
Akash Lal and Shaz Qadeer

The application of software-verification technology towards building realistic bug-finding tools requires working through several precision-scalability tradeoffs. For instance, a critical aspect while dealing with C programs is to formally define the treatment of pointers and the heap (usually termed the “memory model”). A machine-level modeling is often intractable, whereas one that leverages high-level information (such as types) can be inaccurate. Another tradeoff is modeling integer arithmetic....

Publication details
Date: 1 November 2014
Type: Inproceeding
Andrew D. Gordon, Claudio Russo, Marcin Szymczak, Johannes Borgstrom, Nicolas Rolland, Thore Graepel, and Daniel Tarlow

We describe the design, semantics, and implementation of a probabilistic programming language where programs are spreadsheet queries. Given an input database consisting of tables held in a spreadsheet, a query constructs a probabilistic model conditioned by the spreadsheet data, and returns an output database determined by inference. This work extends probabilistic programming systems in three novel aspects: (1) embedding in spreadsheets, (2) dependently-typed functions, and (3) typed distinction...

Publication details
Date: 1 November 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-135
Miltiadis Allamanis, Earl T. Barr, Christian Bird, and Charles Sutton

Every programmer has a characteristic style, ranging from preferences about identifier naming to preferences about object relationships and design patterns. Coding conventions define a consistent syntactic style, fostering readability and hence maintainability.

When collaborating, programmers strive to obey a project’s coding conventions. However, one third of reviews of changes contain feedback about coding conventions, indicating that programmers do not always follow them and that project members...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Alex Taylor, Jasmin Fisher, Byron Cook, Samin Ishtiaq, and Nir Piterman

Computational biology is a nascent field reliant on software coding and modelling to produce insights into biological phenomena. Extreme claims cast it as a field set to replace conventional forms of experimental biology, seeing software modelling as a (more convenient) proxy for bench-work in the wet-lab. In this article, we deepen and complicate the relations between computation and scientific ways of knowing by discussing a computational biology tool, BMA, that models gene regulatory networks. We...

Publication details
Date: 1 November 2014
Type: Article
Tom Crick, Benjamin A. Hall, and Samin Ishtiaq

The reproduction and replication of novel scientific results has become a major issue for a number of disciplines. In computer science and related disciplines such as systems biology, the issues closely revolve around the ability to implement novel algorithms and approaches. Taking an approach from the literature and applying it in a new codebase frequently requires local knowledge missing from the published manuscripts and project websites. Alongside this...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: WSSSPE
Chengnian Sun, Haidong Zhang, Jian-Guang Lou, Hongyu Zhang, Qiang Wang, Siau-Cheng Khoo, and Dongmei Zhang

In a bug tracking system (e.g. Bugzilla), the lifetime of a bug report usually consists of a sequence of revisions to itself, referred to as bug report evolution in this paper. Each revision contains changes to one or more fields in the bug report. Such evolution information is an essential indicator of software process maturity evaluation in an organization. Understanding bug report evolution is also useful for research on mining bug repositories. However, current bug tracking systems provide limited...

Publication details
Date: 1 November 2014
Type: Proceedings
Danyel Fisher, Badrish Chandramouli, Robert DeLine, Jonathan Goldstein, Andrei Aron, Mike Barnett, John C. Platt, James F. Terwilliger, and John Wernsing

Over the last two decades, data scientists have performed increasingly sophisticated analyses on larger data sets, yet their tools and workflows remain low-level. A typical analysis involves different tools for different stages of the work, requiring file transfers and considerable care to keep everything organized. Temporal data adds further complexity: users typically must write queries offline before porting them to production systems. To address these problems, this paper introduces Tempe, a web...

Publication details
Date: 1 November 2014
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2014-148
André N. Meyer, Thomas Fritz, Gail C. Murphy, and Thomas Zimmermann

The better the software development community becomes at creating software, the more software the world seems to demand. Although there is a large body of research about measuring and investigating productivity from an organizational point of view, there is a paucity of research about how software developers, those at the front-line of software construction, think about, assess and try to improve their productivity. To investigate software developers' perceptions of software development productivity, we...

Publication details
Date: 1 November 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Position Paper

Event-driven programming avoids wasting user and CPU time, but is difficult to perform since program control flow is necessarily inverted and twisted. To make reactive programming easier, many advocate burying control flow within abstractions that are composed via data flow instead. This might be a mistake: data flow has issues with expressiveness and usability that might not pan out. Instead, control flow could be re-invented to hide the adverse effects of CPU time while preserving expressiveness and...

Publication details
Date: 21 October 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Chris Hawblitzel, Jon Howell, Jacob R. Lorch, Arjun Narayan, Bryan Parno, Danfeng Zhang, and Brian Zill

An Ironclad App lets a user securely transmit her data to a remote machine with the guarantee that every instruction executed on that machine adheres to a formal abstract specification of the app’s behavior. This does more than eliminate implementation vulnerabilities such as buffer overflows, parsing errors, or data leaks; it tells the user exactly how the app will behave at all times. We provide these guarantees via complete, low-level software verification. We then use cryptography and secure...

Publication details
Date: 6 October 2014
Type: Inproceeding
Publisher: USENIX – Advanced Computing Systems Association
Milos Gligoric, Wolfram Schulte, Chandra Prasad, Danny van Velzen, Iman Narasamdya, and Benjamin Livshits

The efficiency of a build system is an important factor for developer productivity. As a result, developer teams have been increasingly adopting new build systems that allow higher build parallelization. However, migrating the existing legacy build scripts to new build systems is a tedious and error-prone process. Unfortunately, there is insufficient support for automated migration of build scripts, making the migration more problematic.

We propose the first dynamic approach for...

Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery
Akash Lal and Shaz Qadeer

A goal-directed search attempts to reveal only relevant information needed to establish reachability (or unreachability) of the goal from the initial state of the program. The further apart the goal is from the initial state, the harder it can get to establish what is relevant. This paper addresses this concern in the context of programs with assertions that may be nested deeply inside their call graph and thus far away interprocedurally from main. We present a source-to-source transformation on...

Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: FMCAD
Sean McDirmid and Jonathan Edwards

Most languages expose the computer’s ability to globally read and write memory at any time. Programmers must then choreograph control flow so all reads and writes occur in correct relative orders, which can be difficult, particularly when dealing with initialization, reactivity, and concurrency. Just as many languages now manage memory to unburden us from properly freeing memory, they should also manage time to automatically order memory accesses for us in the interests of...

Publication details
Date: 1 October 2014
Type: Inproceeding
Publisher: ACM – Association for Computing Machinery