Nikolai Tillmann

Nikolai Tillmann
PRINCIPAL DEVELOPMENT LEAD
.

My main areas of research are program authoring on mobile devices, program analysis, testing, optimization, and verification.

I started the TouchDevelop project, which enables end-users to write programs for mobile devices on mobile devices. This project brings the excitement of the first programmable computers to mobile devices such as smartphones. touchdevelop.com

I also am leading the Pex project, a framework and tool for runtime verification and automatic test case generation for .NET applications based on parameterized unit testing and dynamic symbolic execution. Fakes, the test isolation framework formerly called Moles, shipped with Visual Studio 2012 / 2013. Code Digger is a radically simplified version of Pex, available as a free extension for Visual Studio 2012 / 2013.

To make the Pex engine readily available, I created Pex4Fun, a web-based version where anyone can write and explore code in a browser. pex4fun.com

When the idea came up for a coding game based on the Pex engine, we created Code Hunt. codehunt.com

Previously, I was involved in the Spur project, where I was working on a tracing Just-In-Time compiler for .NET and JavaScript code. I worked on AsmL, an executable modeling language that comes with a compiler and a test generation tool, and the Spec Explorer 2004 model-based testing tool. We developed XRT, a concrete/symbolic state exploration engine and software model-checker for .NET code. Spec Explorer 2007 is based on this engine, which is now productized internally by the Protocol Engineering Team at Microsoft to facilitate quality assurance of protocol documentation (link).

Before coming to Microsoft Research, I was involved in the development of a school management system in Germany.

Publications

    2014

    • Nikolai Tillmann, Michal Moskal, Peli de Halleux, Sebastian Burckhardt, Tom Ball, and Judith Bishop, Tutorial: Create Rich Mobile Apps on Touch Devices, in Proceedings of the First International Conference on Mobile Software Engineering and Systems, , June 2014
      We are experiencing a technology shift: Powerful and easy-to-use mobile devices like smartphones and tablets are becoming more prevalent than traditional PCs and laptops. Mobile devices are going to be the first and, in less developed countries, possibly the only computing devices which virtually all people will own and carry with them at all times. In this tutorial, participants will learn about developing software directly on their mobile devices. The tutorial is based on TouchDevelop, a modern software development environment that embraces this new reality, treating mobile devices as first-class software development machines, instead of relying on legacy development models built around PC. TouchDevelop comes with typed, structured programming language that is built around the idea of only using a touchscreen as the input device to author code. Access to the cloud, flexible user interfaces, and access to sensors such as accelerometer and GPS are available as a first-class citizens in the programming language. TouchDevelop is available as a web app on Windows tablets, iOS, Android, Windows PCs and Macs, and as a native app on Windows Phone.
    • Michael Hilton, Arpit Christi, Danny Dig, Michal Moskal, Sebastian Burckhardt, and Nikolai Tillmann, Refactoring Local to Cloud Data Types for Mobile Apps, in Proceedings of the First International Conference on Mobile Software Engineering and Systems, , June 2014
      Mobile cloud computing can greatly enrich the capabilities of today’s pervasive mobile devices. Storing data on the cloud can enable features such as automatic backup, seamless transition between multiple devices, and multiuser support for existing apps. However, the process of converting local into cloud data types requires high expertise, is difficult, and time-consuming. Refactoring techniques can greatly simplify this process. In this paper we present a formative study where we analyzed and successfully converted four real-world touchdevelop apps into cloud-enabled apps. Based on these lessons, we designed and implemented, CLOUDIFYER, a tool that automatically refactors local data types into cloud data types on the touchdevelop platform. Our empirical evaluation on a corpus of 123 mobile apps resulting in 2722 transformations shows (i) that the refactoring is widely applicable, (ii) CLOUDIFYER saves human effort, and (iii) CLOUDIFYER is accurate.
    • Nikolai Tillmann, Judith Bishop, R. Nigel Horspool, Daniel Perelman, and Tao Xie, Code Hunt: Searching for Secret Code for Fun, in Proceedings of the International Conference on Software Engineering (Workshops), ACM ICSE Workshop on Seach Based Software Testing 2014, 1 June 2014
      Learning to code can be made more effective and sustainable if it is perceived as fun by the learner. Code Hunt uses puzzles that play- ers have to explore by means of clues presented as test cases. Play- ers iteratively modify their code to match the functional behaviour of secret solutions. This way of learning to code is very different to learning from a specification. It is essentially re-engineering from test cases. Code Hunt is based on the test/clue generation of Pex, a white-box test generation tool that uses dynamic symbolic execu- tion. Pex performs a guided search to determine feasible execution paths. Conceptually, solving a puzzle is the manual process of con- ducting search-based test generation: the “test data” to be generated by the player is the player’s code, and the “fitness values” that re- flect the closeness of the player’s code to the secret code are the clues (i.e., Pex-generated test cases). This paper is the first one to describe Code Hunt and its extensions over its precursor Pex4Fun. Code Hunt represents a high-impact educational gaming platform that not only internally leverages fitness values to guide test/clue generation but also externally offers fun user experiences where search-based test generation is manually emulated. Because the amount of data is growing all the time, the entire system runs in the cloud on Windows Azure.
    • Arno Puder, Nikolai Tillmann, and Michal Moskal, Exposing Native Device APIs to Web Apps, in Proceedings of the First International Conference on Mobile Software Engineering and Systems, , June 2014
      A recent survey among developers revealed that half plan to use HTML5 for mobile apps in the future. An earlier survey showed that access to native device APIs is the biggest shortcoming of HTML5 compared to native apps. Several different approaches exist to overcome this limitation, among them cross-compilation and packaging the HTML5 as a native app. In this paper we propose a novel approach by using a device-local service that runs on the smartphone and that acts as a gateway to the native layer for HTML5-based apps running inside the standard browser. WebSockets are used for bi-directional communication between the web apps and the device-local service. The service approach is a generalization of the packaging solution. In this paper we describe our approach and compare it with other popular ways to grant web apps access to the native API layer of the operating system.
    • Nikolai Tillmann, Judith Bishop, Nigel Horspool, Daniel Perelman, and Tao Xie, Code Hunt - Searching for Secret Code for Fun, in Proceedings of the 7th International Workshop on Search-Based Software Testing , June 2014
      Code Hunt - Searching for Secret Code for Fun
    • Rafael Auler, Edson Borin, Peli de Halleux, Michal Moskal, and Nikolai Tillmann, Addressing JavaScript JIT engines performance quirks: A crowdsourced adaptive compiler, in Proceedings of Compiler Construction - 23rd International Conference, CC 2014, , April 2014
      JavaScript has long outpaced its original target applications, being used not only for coding complex web clients, but also web servers, game development and even desktop applications. The most appealing advantage of moving applications to JavaScript is its capability to run the same code in a large number of different devices. It is not surprising that many compilers target JavaScript as an intermediate language. However, writing optimizations and analyses passes for a compiler that emits JavaScript is challenging: a long time spent in optimizing the code in a certain way can be excellent for some browsers, but a futile effort for others. For example, we show that applying JavaScript code optimiza- tions in a tablet with Windows 8 and Internet Explorer 11 increased performance by, on average, 5 times, while running in a desktop with Windows 7 and Firefox decreased performance by 20%. Such a scenario demands a radical new solution for the traditional compiler optimiza- tion ow. This paper proposes collecting web clients performance data to build a crowdsourced compiler ag suggestion system in the cloud that helps the compiler perform the appropriate optimizations for each client platform. Since this information comes from crowdsourcing rather than manual investigations, fruitless or harmful optimizations are automatically discarded. Our approach is based on live measurements done while clients use the application on real platforms, proposing a new paradigm on how optimizations are tested.
    • Nikolai Tillmann, Peli de Halleux, Tao Xie, and Judith Bishop, Code Hunt: Gamifying Teaching and Learning of Computer Science at Scale, in Proceedings of the 1st ACM Conference on Learning at Scale, , 2014
      Code Hunt (http://www.codehunt.com/) is an educational coding game (that runs in a browser) for teaching and learning computer science at scale. The game consists of a series of worlds and levels, which get increasingly challenging. In each level, the player has to discover a secret code fragment and write code for it. The game has sounds and a leaderboard to keep the player engaged. Code Hunt targets teachers and students from introductory to advanced programming or software engineering courses. In addition, Code Hunt can be used by seasoned developers to hone their programming skills or by companies to evaluate job candidates. At the core of the game experience is an automated program analysis and grading engine based on dynamic symbolic execution. The engine detects any behavioral differences between the player’s code and the secret code fragment. The game works in any modern browser, and currently supports C# or Java programs. Code Hunt is a dramatic evolution of our earlier Pex4Fun web plat- form, from which we have gathered considerable experience (including over 1.4 million programs submitted by users).

    2013

    • Xusheng Xiao, Sihan Li, Tao Xie, and Nikolai Tillmann, Characteristic Studies of Loop Problems for Structural Test Generation via Symbolic Execution, in Proc. 28th IEEE/ACM International Conference on Automated Software Engineering (ASE 2013), November 2013
      Dynamic Symbolic Execution (DSE) is a state-of-the-art test-generation approach that systematically explores program paths to generate high-covering tests. In DSE, the presence of loops (especially unbound loops) can cause an enormous or even infinite number of paths to be explored. There exist techniques (such as bounded iteration, heuristics, and summarization) that assist DSE in addressing loop problems. However, there exists no literature-survey or empirical work that shows the pervasiveness of loop problems or identifies challenges faced by these techniques on real-world open-source applications. To fill this gap, we provide characteristic studies to guide future research on addressing loop problems for DSE. Our proposed study methodology starts with conducting a literature-survey study to investigate how technical problems such as loop problems compromise automated software-engineering tasks such as test generation, and which existing techniques are proposed to deal with such technical problems. Then the study methodology continues with conducting an empirical study of applying the existing techniques on real-world software applications sampled based on the literature-survey results and major open-source project hostings. This empirical study investigates the pervasiveness of the technical problems and how well existing techniques can address such problems among real-world software applications. Based on such study methodology, our two-phase characteristic studies identify that bounded iteration and heuristics are effective in addressing loop problems when used properly. Our studies further identify challenges faced by these techniques and provide guidelines for effectively addressing these challenges.
    • Nikolai Tillmann, Jonathan de Halleux, Tao Xie, and Judith Bishop, Pex4Fun: A Web-Based Environment for Educational Gaming via Automated Test Generation, in Proc. 28th IEEE/ACM International Conference on Automated Software Engineering (ASE 2013), Tool Demonstrations, November 2013
      Pex4Fun (http://www.pex4fun.com/) is a web-based educational gaming environment for teaching and learning programming and software engineering. Pex4Fun can be used to teach and learn programming and software engineering at many levels, from high school all the way through graduate courses. With Pex4Fun, a student edits code in any browser – with Intellisense – and Pex4Fun executes it and analyzes it in the cloud. Pex4Fun connects teachers, curriculum authors, and students in a unique social experience, tracking and streaming progress updates in real time. In particular, Pex4Fun finds interesting and unexpected input values (with Pex, an advanced test-generation tool) that help students understand what their code is actually doing. The real fun starts with coding duels where a student writes code to implement a teacher’s secret specification (in the form of sample-solution code not visible to the student). Pex4Fun finds any discrepancies in behavior between the student’s code and the secret specification. Such discrepancies are given as feedback to the student to guide how to fix the student’s code to match the behavior of the secret specification. This tool demonstration shows how Pex4Fun can be used in teaching and learning, such as solving coding duels, exploring course materials in feature courses, creating and teaching a course, creating and publishing coding duels, and learning advanced topics behind Pex4Fun.
    • Sihan Li, Tao Xie, and Nikolai Tillmann, A Comprehensive Field Study of End-User Programming on Mobile Devices, in Proc. IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC 2013), September 2013
      TouchDevelop represents a radically new programming environment that enables users to develop mobile applications directly on mobile devices. TouchDevelop has successfully drawn a huge number of end users, who have published thousands of TouchDevelop scripts online. To enhance end-user programming on mobile devices, we conduct the first comprehensive field study of 17322 TouchDevelop scripts and 4275 users. Our study consists of an overall study on the characteristics of scripts (e.g., structural features, code reuse) and users (e.g., expertise), and a longitudinal study on how they evolve over time. Our study results show important characteristics of scripts such as dense external method calls, high code reuse ratio, and also reveal interesting evolution patterns of users. The findings and implications in our study provide valuable guidelines for improving tool support or services for end users and increasing the popularity of end-user programming on mobile devices.
    • Sebastian Burckhardt, Manuel Fahndrich, Peli de Halleux, Jun Kato, Sean McDirmid, Michal Moskal, and Nikolai Tillmann, It's Alive! Continuous Feedback in UI Programming, in PLDI, ACM SIGPLAN, June 2013
      Live programming allows programmers to edit the code of a running program and immediately see the effect of the code changes. This tightening of the traditional edit-compile-run cycle reduces the cognitive gap between program code and behavior, improving the learning experience of beginning programmers while boosting the productivity of seasoned ones. Unfortunately, live programming is difficult to realize in practice as imperative languages lack well-defined abstraction boundaries that make live programming responsive or its feedback comprehensible. This paper enables live programming for user interface programming by cleanly separating the rendering and non-rendering aspects of a UI program, allowing the display to be refreshed on a code change without restarting the program. A type and effect system formalizes this separation and provides an evaluation model that incorporates the code update step. By putting live programming on a more formal footing, we hope to enable critical and technical discussion of live programming systems.
    • Konrad Jamrozik, Gordon Fraser, Nikolai Tillmann, and Jonathan de Halleux, Generating Test Suites with Augmented Dynamic Symbolic Execution, in Proceedings of the 7th International Conference on Tests & Proofs, June 2013
      Unit test generation tools typically aim at one of two objectives: to explore the program behavior in order to exercise automated oracles, or to produce a representative test set that can be used to manually add oracles or to use as a regression test set. Dynamic symbolic execution (DSE) can efficiently explore all simple paths through a program, exercising automated oracles such as assertions or code contracts. However, its original intention was not to produce representative test sets. Although DSE tools like Pex can retain subsets of the tests seen during the exploration, customer feedback revealed that users expect different values than those produced by Pex, and sometimes also more than one value for a given condition or program path. This threatens the applicability of DSE in a scenario without automated oracles. Indeed, even though all paths might be covered by DSE, the resulting tests are usually not sensitive enough to make a good regression test suite. In this paper, we present augmented dynamic symbolic execution, which aims to produce representative test sets with DSE by augmenting path conditions with additional conditions that enforce target criteria such as boundary or mutation adequacy, or logical coverage criteria. Experiments with our APex prototype demonstrate that the resulting test cases can detect up to 30% more seeded defects than those produced with Pex.
    • Tao Xie, Nikolai Tillmann, and Jonathan de Halleux, Educational Software Engineering: Where Software Engineering, Education, and Gaming Meet, in Proc. 3rd International Workshop on Games and Software Engineering (GAS 2013), May 2013
      We define and advocate the subfield of educational software engineering (i.e., software engineering for education), which develops software engineering technologies (e.g., software testing and analysis, software analytics) for general educational tasks, going beyond educational tasks for software engineering. In this subfield, gaming technologies often play an important role together with software engineering technologies. We expect that researchers in educational software engineering would be among key players in the education domain and in the coming age of Massive Open Online Courses (MOOCs). Educational software engineering can and will contribute significant solutions to address various critical challenges in education especially MOOCs such as automatic grading, intelligent tutoring, problem generation, and plagiarism detection. In this position paper, we define educational software engineering and illustrate Pex for Fun (in short as Pex4Fun), one of our recent examples on leveraging software engineering and gaming technologies to address educational tasks on teaching and learning programming and software engineering skills.
    • Nikolai Tillmann, Jonathan De Halleux, Tao Xie, Sumit Gulwani, and Judith Bishop, Teaching and Learning Programming and Software Engineering via Interactive Gaming, in Proc. 35th International Conference on Software Engineering (ICSE 2013), Software Engineering Education (SEE), May 2013
      Massive Open Online Courses (MOOCs) have recently gained high popularity among various universities and even in global societies. A critical factor for their success in teaching and learning effectiveness is assignment grading. Traditional ways of assignment grading are not scalable and do not give timely or interactive feedback to students. To address these issues, we present an interactive-gaming-based teaching and learning platform called Pex4Fun. Pex4Fun is a browser-based teaching and learning environment targeting teachers and students for introductory to advanced programming or software engineering courses. At the core of the platform is an automated grading engine based on symbolic execution. In Pex4Fun, teachers can create virtual classrooms, customize existing courses, and publish new learning material including learning games. Pex4Fun was released to the public in June 2010 and since then the number of attempts made by users to solve games has reached over one million. Our work on Pex4Fun illustrates that a sophisticated software engineering technique – automated test generation – can be successfully used to underpin automatic grading in an online programming system that can scale to hundreds of thousands of users.
    • Tuan A. Nguyen, Christoph Csallner, and Nikolai Tillmann, GROPG: A graphical on-phone debugger, in Proc. 35th ACM/IEEE International Conference on Software Engineering (ICSE), New Ideas and Emerging Results (NIER) track, May 2013
      Debugging mobile phone applications is hard, as current debugging techniques either require multiple computing devices or do not support graphical debugging. To address this problem we present GROPG, the first graphical on-phone debugger. We implement GROPG for Android and perform a preliminary evaluation on third-party applications. Our experiments suggest that GROPG can lower the overall debugging time of a comparable text-based on-phone debugger by up to 2/3.
    • Sihan Li, Tao Xie, and Nikolai Tillmann, A Comprehensive Field Study of End-User Programming on Mobile Devices, no. TR-2013-3, March 2013
      TouchDevelop represents a radically new programming environment that enables users to develop mobile applications directly on mobile devices. TouchDevelop has successfully drawn a huge number of end users, who have published thousands of TouchDevelop scripts online. To enhance end-user programming on mobile devices, we conduct the first comprehensive field study of 17322 TouchDevelop scripts and 4275 users. Our study consists of an overall study on the characteristics of scripts (e.g., structural features, code reuse) and users (e.g., expertise), and a longitudinal study on how they evolve over time. Our study results show important characteristics of scripts such as dense external method calls, high code reuse ratio, and also reveal interesting evolution patterns of users. The findings and implications in our study provide valuable guidelines for improving tool support or services for end users and increasing the popularity of end-user programming on mobile devices.

    2012

    • Nikolai Tillmann, Michal Moskal, Jonathan de Halleux, Manuel Fahndrich, and Sebastian Burckhardt, TouchDevelop — App Development on Mobile Devices, in Proc. 20th International Symposium on Foundations of Software Engineering (FSE 2012), Demonstration, ACM, November 2012
      Mobile devices are becoming the prevalent computing platform for most people. TouchDevelop is a new mobile development environment that enables anyone with a Windows Phone to create new apps directly on the smartphone, without a PC or a traditional keyboard. At the core is a new mobile programming language and editor that was designed with the touchscreen as the only input device in mind. Programs written in TouchDevelop can leverage all phone sensors such as GPS, cameras, accelerometer, gyroscope, and stored personal data such as contacts, songs, pictures. Thousands of programs have already been written and published with TouchDevelop.
    • Tuan Anh Nguyen, Christoph Csallner, and Nikolai Tillmann, GROPG: A graphical on-phone debugger, no. CSE-2012-3, November 2012
      Debugging mobile phone applications is hard, as current debugging techniques either require multiple computing devices or do not support graphical debugging. To address this problem we present GROPG, the first graphical on-phone debugger. We implement GROPG for Android and perform a preliminary evaluation on third-party applications. Our experiments suggest that GROPG can lower the overall debugging time of a comparable text-based on-phone debugger by up to 2/3.
    • Xusheng Xiao, Nikolai Tillmann, Manuel Fahndrich, Jonathan de Halleux, and Michal Moskal, User-Aware Privacy Control via ExtendedStatic-Information-Flow Analysis, in Proc. 27th IEEE/ACM International Conference on Automated Software Engineering (ASE 2012), September 2012
    • Nikolai Tillmann, Michal Moskal, Jonathan de Halleux, Manuel Fahndrich, Judith Bishop, Arjmand Samuel, and Tao Xie, The Future of Teaching Programming is on Mobile Devices, in Proc. 17th Annual Conference on Innovation and Technology in Computer Science Education (ITiCSE 2012), July 2012
      From paper to computers, the way that we have been writing down thoughts and performing symbolic computations has been constantly evolving. Teaching methods closely follow this trend, leveraging existing technology to make teaching more effective and preparing students for their later careers with the available technology. Right now, in 2012, we are in the middle of another technology shift: instead of using PCs and laptops, mobile devices are becoming more prevalent for most everyday computing tasks. In fact, never before in human history were incredibly powerful and versatile computing devices such as smartphones available and adopted so broadly. We propose that computer programming, and thus the teaching of programming, can and should be done directly on the mobile devices themselves, without the need for a separate PC or laptop to write code. Programming on smartphones that we carry around with us at all times means instant gratification for students, as they can show their games and applications to their friends, and it means that students can do their homework or additional practicing at all times. We describe TouchDevelop, a novel mobile programming environment, and call out challenges that need to be overcome and opportunities that it creates.
    • Marat Akhin, Nikolai Tillmann, Manuel Fahndrich, Jonathan de Halleux, and Michal Moskal, Search by Example in TouchDevelop: Code Search Made Easy, in Proceedings of the 2012 ICSE Workshop on Search-Driven Development-Users, Infrastructure, Tools and Evaluation, IEEE, June 2012
      Code search has always been essential to software development; it is the cornerstone of activities such as program comprehension and maintenance. Traditionally, code search required learning of complex query languages with very steep learning curves. In contrast, programming environments for mobile devices targeting novice programmers are becoming popular and code search is becoming increasingly important. Yet, dedicated code query languages present a learning barrier for novice programmers. In this paper we consider “search-by-example” as a way of dealing with this problem. Given a query code snippet, we find all similar snippets in the codebase and present them to the user. This problem is a special instance of the clone detection problem, and, by using relevant techniques, we can perform precise code search with little to no configuration and completely agnostic of code formatting, variable renamings, etc. These properties make “search-by-example” very easy to use by inexperienced programmers. We built a prototype of our approach in TouchDevelop, a novel mobile app development environment for Windows Phone. We will use it as a testing ground for future evaluation.
    • Tuan A. Nguyen, Sarker T.A. Rumee, Christoph Csallner, and Nikolai Tillmann, An experiment in developing small mobile phone applications comparing on-phone to off-phone development, in Proc. 1st International Workshop on User Evaluation for Software Engineering Researchers (USER), IEEE, June 2012
      TouchDevelop represents a radically new mobile application development model, as TouchDevelop enables mobile application development on a mobile device. I.e., with TouchDevelop, the task of programming say a Windows Phone is shifted from the desktop computer to the mobile phone itself. We describe a first experiment on independent, non-expert subjects to compare programmer productivity using TouchDevelop vs. using a more traditional approach to mobile application development.
    • Nikolai Tillmann, Jonathan de Halleux, Tao Xie, and Judith Bishop, Pex4Fun: Teaching and Learning Computer Science via Social Gaming, in Proc. 24th IEEE-CS Conference on Software Engineering Education and Training (CSEE&T 2012), Tutorial, April 2012
    • Nikolai Tillmann, Michal Moskal, Jonathan de Halleux, Manuel Fahndrich, and Tao Xie, Engage Your Students by Teaching Programming Using Only Mobile Devices with TouchDevelop, in Proc. 24th IEEE-CS Conference on Software Engineering Education and Training (CSEE&T 2012), Tutorial, April 2012
    • Nikolai Tillmann, Michal Moskal, Jonathan de Halleux, Manuel Fahndrich, and Tao Xie, Engage Your Students by Teaching Programming Using Only Mobile Devices with TouchDevelop (abstract only), in Proc. 43rd ACM Technical Symposium on Computer Science Education (SIGCSE 2012), Workshop Summary, February 2012
    • Linghao Zhang, Xiaoxing Ma, Jian Lu, Nikolai Tillmann, Jonathan de Halleux, and Tao Xie, Environment Modeling for Automated Testing of Cloud Applications, in IEEE Software, Special Issue on Software Engineering for Cloud Computing, vol. 29, no. 2, pp. 30–35, 2012
      Recently, cloud computing platforms, such as Microsoft Azure, are available to provide convenient infrastructures such that cloud applications could conduct cloud and data‐intensive computing. To ensure high quality of cloud applications under development, developer testing (also referred to as unit testing) could be used. The behavior of a unit in a cloud application is dependent on the test inputs as well as the state of the cloud environment. Generally, manually providing various test inputs and cloud states for conducting developer testing is time‐consuming and labor‐intensive. To reduce the manual effort, developers could employ automated test generation tools. However, applying an automated test generation tool faces the challenge of generating various cloud states for achieving effective testing, such as achieving high structural coverage of the cloud application since these tools cannot control the cloud environment. To address this challenge, we propose an approach to (1) model the cloud environment for simulating the behavior of the real environment and, (2) apply Dynamic Symbolic Execution (DSE) to both generate test inputs and cloud states to achieve high structural coverage. We apply our approach on some open source Azure cloud applications. The result shows that our approach automatically generates test inputs and cloud states to achieve high structural coverage of the cloud applications.
    • Nikolai Tillmann and Judith Bishop, Teaching programming on a mobile device, in ITiCSE, ACM, 2012
    • Nikolai Tillmann, Jonathan de Halleux, Tao Xie, and Judith Bishop, Teaching and learning computing via social gaming with Pex4Fun (abstract only), in SIGCSE, ACM, 2012
    • Konrad Jamrozik, Gordon Fraser, Nikolai Tillmann, and Jonathan De Halleux, Augmented dynamic symbolic execution, in Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering, ACM, New York, NY, USA, 2012
    • Dries Vanoverberghe, Jonathan de Halleux, Nikolai Tillmann, and Frank Piessens, State Coverage: Software Validation Metrics beyond Code Coverage, in Proceedings of the 38th international conference on Current Trends in Theory and Practice of Computer Science , 2012
      Currently, testing is still the most important approach to reduce the amount of software defects. Software quality metrics help to prioritize where additional testing is necessary by measuring the quality of the code. Most approaches to estimate whether some unit of code is sufficiently tested are based on code coverage, which measures what code fragments are exercised by the test suite. Unfortunately, code coverage does not measure to what extent the test suite checks the intended functionality. We propose state coverage, a metric that measures the ratio of state updates that are read by assertions with respect to the total number of state updates, and we present efficient algorithms to measure state coverage. Like code coverage, state coverage is simple to understand and we show that it is effective to measure and easy to aggregate. During a preliminary evaluation on several open-source libraries, state coverage helped to identify multiple unchecked properties and detect several bugs.

    2011

    • Suresh Thummalapenta, Tao Xie, Nikolai Tillmann, Jonathan de Halleux, and Zhendong Su, Synthesizing Method Sequences for High-Coverage Testing, in Proc. ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2011), October 2011
    • Marat Akhin, Nikolai Tillmann, Manuel Fahndrich, Jonathan de Halleux, and Michal Moskal, Code Similarity in TouchDevelop: Harnessing Clones, no. MSR-TR-2011-103, 9 September 2011
      The number of applications available in mobile marketplaces is increasing rapidly. It's very easy to become overwhelmed by the sheer size of their codebase. We propose to use code clone analysis to help manage existing applications and develop new ones. First, we propose an automatic application ranking scheme based on (dis)similarity. Traditionally, applications in app stores are ranked manually, by user or moderator input. We argue that automatically computed (dis)similarity information can be used to reinforce this ranking and help in dealing with possible application cloning. Second, we consider code snippet search, a task commonly performed by application developers. We view it as a special instance of the clone detection problem which allows us to perform precise search with little to no configuration and completely agnostic of code formatting, variable renamings, etc. We built a prototype of our approach in TouchDevelop, a novel application development environment for Windows Phone, and will use it as a testing ground for future evaluation.
    • Xusheng Xiao, Nikolai Tillmann, Manuel Fahndrich, Peli de Halleux, and Michal Moskal, Transparent Privacy Control via Static Information Flow Analysis, no. MSR-TR-2011-93, 2 August 2011
      A common problem faced by modern mobile-device platforms is that third-party applications in the marketplace may leak private information without notifying users. Existing approaches adopted by these platforms provide little information on what applications will do with the private information, failing to effectively assist users in deciding whether to install applications and in controlling their privacy. To address this problem, we propose a transparent privacy control approach, where an automatic static analysis reveals to the user how private information is used inside an application. This flow information provides users with better insights, enabling them to determine when to use anonymized instead of real information, or to force script termination when scripts access private information. To further reduce the user burden in controlling privacy, our approach provides a default setting based on an extended information flow analysis that tracks whether private information is obscured before escaping through output channels. We built our approach into TouchDevelop, a novel application-creation environment that allows users to write application scripts on mobile devices, share them in a web bazaar, and install scripts published by other users. To evaluate our approach, we plan to study a portion of published scripts in order to evaluate the effectiveness and performance of information flow analysis. We also plan to carry out a user survey to evaluate the usability of our privacy control and guide our future design.
    • Kunal Taneja, Tao Xie, Nikolai Tillmann, and Jonathan de Halleux, eXpress: Guided Path Exploration for Efficient Regression Test Generation, in Proc. 2011 International Symposium on Software Testing and Analysis (ISSTA 2011), July 2011
    • Nikolai Tillmann, jhalleux, and Tao Xie, Pex4Fun: Cloud Computing in the CS Classroom, in The Voice of K-12 Computer Science Education and Its Educators, vol. 7, no. 3, pp. 4-5, July 2011
    • Xusheng Xiao, Tao Xie, Nikolai Tillmann, and Jonathan de Halleux, Precise Identification of Problems for Structural Test Generation, in Proc. Proceedings of the 33rd International Conference on Software Engineering (ICSE 2011), May 2011
    • Xusheng Xiao, Tao Xie, Nikolai Tillmann, and Jonathan de Halleux, Covana: Precise Identification of Problems in Pex, in Proc. 33rd International Conference on Software Engineering (ICSE 2011), Demonstration, May 2011
    • Xi Ge, Kunal Taneja, Tao Xie, and Nikolai Tillmann, DyTa: Dynamic Symbolic Execution Guided with Static Verification Results, in Proc. 33rd International Conference on Software Engineering (ICSE 2011), Demonstration, May 2011
    • Nikolai Tillmann, Michal Moskal, Jonathan de Halleux, and Manuel Fahndrich, TouchDevelop - Programming Cloud-Connected Mobile Devices via Touchscreen, no. MSR-TR-2011-49, 15 April 2011
      The world is experiencing a technology shift. In 2011, more touchscreen-based mobile devices like smartphones and tablets will be sold than desktops, laptops, and netbooks combined. In fact, in many cases incredibly powerful and easy-to-use smart phones are going to be the first and, in less developed countries, possibly the only computing devices which virtually all people will own, and carry with them at all times. Furthermore, mobile devices do not only have touchscreens, but they are also equipped with a multitude of sensors, such as location information and acceleration, and they are always connected to the cloud. TouchDevelop is a novel application creation environment for anyone to script their smartphones anywhere -- you do not need a separate PC. TouchDevelop allows you to develop mobile device applications that can access your data, your media, your sensors and allows using cloud services including storage, computing, and social networks. TouchDevelop targets students, and hobbyists, not necessarily the professional developer. Typical TouchDevelop applications are written for fun, or for personalizing the phone. TouchDevelop's typed, structured programming language is built around the idea of only using a touchscreen as the input device to author code. It has built-in primitives which make it easy to access the rich sensor data available on a mobile device. In our vision, the state of the program is automatically distributed between mobile clients and the cloud, with automatic synchronization of data and execution between clients and cloud, liberating the programmer from worrying (or even having to know about) the details. We report on our experience with our first prototype implementation for the Windows Phone 7 platform, which already realizes a large portion of our vision. It is available on the Windows Phone Marketplace.
    • Suresh Thummalapenta, Madhuri Marri, Tao Xie, Nikolai Tillmann, and Jonathan de Halleux, Retrofitting Unit Tests for Parameterized Unit Testing, in Proc. International Conference on Fundamental Approaches to Software Engineering (FASE 2011), March 2011
    • Nikolai Tillmann, Jonathan de Halleux, and Tao Xie, Pex for Fun: Engineering an Automated Testing Tool for Serious Games in Computer Science, no. MSR-TR-2011-41, March 2011
      Although there are various emerging serious games developed for education and training purposes, there exist few serious games for practitioners or students to improve their programming or problem-solving skills in the computer science domain. To provide an open platform for creating serious games in learning computer science, we have developed a web‐based serious gaming environment called Pex for Fun, in short Pex4Fun (http://www.pexforfun.com/), for learning critical computer science skills such as problem solving skills and abstraction skills. It works on any web‐enabled device, even a smart phone. It comes with an auto‐completing code editor, providing a user with instant feedback similar to the code editor in Microsoft Visual Studio. It is a cloud application with the data in the cloud, enabling a user to use it anywhere where an Internet connection is available. New learners of programming can play games there created by us to master basic programming concepts such as arrays, loops, and exception handling. In fact, any user can create new games for others to play.
    • Judith Bishop, Jonathan de Halleux, Nikolai Tillmann, R. Nigel Horspool, Don Syme, and Tao Xie, Browser-based software for technology transfer, in SAICSIT Conf., 2011
    • Nikolai Tillmann, Jonathan de Halleux, and Tao Xie, Pex4Fun: Teaching and learning computer science via social gaming, in CSEE&T, 2011
    • Nikolai Tillmann, Michal Moskal, Jonathan de Halleux, and Manuel Fahndrich, TouchDevelop: programming cloud-connected mobile devices via touchscreen, in Proceedings of the 10th SIGPLAN symposium on New ideas, new paradigms, and reflections on programming and software, ACM, New York, NY, USA, 2011
    • Cristian Cadar, Patrice Godefroid, Sarfraz Khurshid, Corina S. Pasareanu, Koushik Sen, Nikolai Tillmann, and Willem Visser, Symbolic execution for software testing in practice: preliminary assessment, in ICSE, 2011
    • Michael Philippsen, Nikolai Tillmann, and Daniel Brinkers, Double inspection for run-time loop parallelization, in Proceedings of the 24th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2011), 2011
      The Inspector/Executor is well-known for parallelizing loops with irregular access patterns that cannot be analyzed statically. The downsides of existing inspectors are that it is hard to amortize their high run-time overheads by actually executing the loop in parallel, that they can only be applied to loops with dependencies that do not change during their execution and that they are often specifically designed for array codes and are in general not applicable in object oriented just-in-time compilation. In this paper we present an inspector that inspects a loop twice to detect if it is fully parallelizable. It works for arbitrary memory access patterns, is conservative as it notices if changing data dependencies would cause errors in a potential parallel execution, and most importantly, as it is designed for current multicore architectures it is fast - despite of its double inspection effort: it pays off at its first use. On benchmarks we can amortize the inspection overhead and outperform the sequential version from 2 or 3 cores onward.

    2010

    • Tao Xie, Nikolai Tillmann, Jonathan de Halleux, and Wolfram Schulte, Future of Developer Testing: Building Quality in Code, in Proc. FSE/SDP Workshop on the Future of Software Engineering Research (FoSER 2010), November 2010
    • Kiran Lakhotia, Nikolai Tillmann, Mark Harman, and Jonathan de Halleux, FloPSy - Search-Based Floating Point Constraint Solving for Symbolic Execution, in Proceedings of 22nd IFIP International Conference on Testing Software and Systems, Springer Verlag, November 2010
      Recently there has been an upsurge of interest in both, Search–Based Software Testing (SBST), and Dynamic Symbolic Execution (DSE). Each of these two approaches has complementary strengths and weaknesses, making it a natural choice to explore the degree to which the strengths of one can be exploited to offset the weakness of the other. This paper introduces an augmented version of DSE that uses a SBST–based approach to handling floating point computations, which are known to be problematic for vanilla DSE. The approach has been implemented as a plug in for the Microsoft Pex DSE testing tool. The paper presents results from both, standard evaluation benchmarks, and two open source programs.
    • Tao Xie, Jonathan de Halleux, Nikolai Tillmann, and Wolfram Schulte, Teaching and Training Developer-Testing Techniques and Tool Support , in of the 25th Annual ACM Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2010), Educators' and Trainers' Symposium, Reno/Tahoe Nevada, Association for Computing Machinery, Inc., October 2010
    • Lingming Zhang, Tao Xie, Lu Zhang, Nikolai Tillmann, Jonathan de Halleux, and Hong Mei, Test Generation via Dynamic Symbolic Execution for Mutation Testing, in Proceedings of the 26th IEEE International Conference on Software Maintenance, September 2010
    • Rahul Pandita, Tao Xie, Nikolai Tillmann, and Jonathan de Halleux, Guided Test Generation for Coverage Criteria, in Proceedings of the 26th IEEE International Conference on Software Maintenance , September 2010
    • Kunal Taneja, Nuo Li, Madhuri Marri, Tao Xie, and Nikolai Tillmann, MiTV: Multiple-Implementation Testing of User-Input Validators for Web Applications, in Proceedings of the 25th IEEE/ACM International Conference on Automated Software Engineering , September 2010
    • Jonathan de Halleux and Nikolai Tillmann, Moles: tool-assisted environment isolation with closures, in TOOLS'10, Springer Verlag, July 2010
      Isolating test cases from environment dependencies is often desirable, as it increases test reliability and reduces test execution time. However, code that calls non-virtual methods or consumes sealed classes is often impossible to test in isolation. Moles is a new lightweight framework which addresses this problem. For any .NET method, Moles allows test-code to provide alternative implementations, given as .NET delegates, for which C# provides very concise syntax while capturing local variables in a closure object. Using code instrumentation, the Moles framework will redirect calls to provided delegates instead of the original methods. The Moles framework is designed to work together with the dynamic symbolic execution tool Pex to enable automated test generation. In a case study, testing code programmed against the Microsoft SharePoint Foundation API, we achieved full code coverage while running tests in isolation without an actual SharePoint server. The Moles framework integrates with .NET and Visual Studio.
    • Suresh Thummalapenta, Jonathan Halleux, Nikolai Tillmann, and Scott Wadsworth, DyGen: Automatic Generation of High-Coverage Tests via Mining Gigabytes of Dynamic Traces, in TAP'10, Springer Verlag, July 2010
      Unit tests of object-oriented code exercise particular sequences of method calls. A key problemwhen automatically generating unit tests that achieve high structural code coverage is the selection of relevant method-call sequences, since the number of potentially relevant sequences explodes with the number of methods. To address this issue, we propose a novel approach, called DyGen, that generates tests via mining dynamic traces recorded during program executions. Typical program executions tend to exercise only happy paths that do not include error-handling code, and thus recorded traces often do not achieve high structural coverage. To increase coverage, DyGen transforms traces into parameterized unit tests (PUTs) and uses dynamic symbolic execution to generate new unit tests for the PUTs that can achieve high structural code coverage. In this paper, we show an application of DyGen by automatically generating regression tests on a given version of software.
    • Nikolai Tillmann, Michal Moskal, Wolfram Schulte, Herman Venter, and Manuel Fahndrich, The Unthinkable: Automated Theorem Provers for (Tracing) Just-in-time Compilers, 24 May 2010
      Tracing just-in-time compilers (TJITs) determine frequently executed traces (hot paths and loops) at run time. These traces are then analyzed and optimized, and finally specialized machine code is generated. Up to now, TJITs employed standard compiler construction algorithms to analyze and optimize traces. We propose to leverage automated theorem provers to optimize traces at run time.
    • Wolfram Schulte and Nikolai Tillmann, Automatic Parallelization of Programming Languages: Past, Present and Future, in International Workshop on Multicore Software Engineering, May 2010
    • Margus Veanes, Nikolai Tillmann, and Peli de Halleux, Qex: Symbolic SQL Query Explorer, in LPAR-16, Springer Verlag, April 2010
      We describe a technique and a tool called Qex for generating input tables and parameter values for a given parameterized SQL query. The evaluation semantics of an SQL query is translated into a specific background theory for a satisfiability modulo theories (SMT) solver as a set of equational axioms. Symbolic evaluation of a goal formula together with the background theory yields a model from which concrete tables and values are extracted. We use the SMT solver Z3 in the concrete implementation of Qex and provide an evaluation of its performance.
    • Margus Veanes, Peli de Halleux, and Nikolai Tillmann, Rex: Symbolic Regular Expression Explorer, in Third International Conference on Software Testing, Verification and Validation (ICST) , IEEE, April 2010
      Constraints in form regular expressions over strings are ubiquitous. They occur often in programming languages like Perl and C#, in SQL in form of LIKE expressions, and in web applications. Providing support for regular expression constraints in program analysis and testing has several useful applications. We introduce a method and a tool called Rex, for symbolically expressing and analyzing regular expression constraints. Rex is implemented using the SMT solver Z3, and we provide experimental evaluation of Rex.
    • Michael Bebenita, Florian Brandner, Manuel Fahndrich, Francesco Logozzo, Wolfram Schulte, Nikolai Tillmann, and Herman Venter, SPUR: A Trace-Based JIT Compiler for CIL, no. MSR-TR-2010-27, 25 March 2010
      Tracing just-in-time compilers (TJITs) determine frequently executed traces (hot paths and loops) in running programs and focus their optimization effort by emitting optimized machine code specialized to these traces. Prior work has established this strategy to be especially beneficial for dynamic languages such as JavaScript, where the TJIT interfaces with the interpreter and produces machine code from the JavaScript trace. This direct coupling with a JavaScript interpreter makes it difficult to harness the power of a TJIT for other components that are not written in JavaScript, e.g., the DOM mplementation or the layout engine inside a browser. Furthermore, if a TJIT is tied to a particular high-level language interpreter, it is difficult to reuse it for other input languages as the optimizations are likely targeted at specific idioms of the source language. To address these issues, we designed and implemented a TJIT for Microsoft’s Common Intermediate Language CIL (the target language of C#, VisualBasic, F#, and many other languages). Working on CIL enables TJIT optimizations for any program compiled to this platform. In addition, to validate that the performance gains of a TJIT for JavaScript do not depend on specific idioms of JavaScript that are lost in the translation to CIL, we provide a performance evaluation of our JavaScript runtime which translates JavaScript to CIL and then runs on top of our CIL TJIT.
    • Michael Bebenita, Florian Brandner, Manuel Fahndrich, Francesco Logozzo, Wolfram Schulte, Nikolai Tillmann, and Herman Venter, SPUR: a trace-based JIT compiler for CIL, in Proceedings of the ACM international conference on Object oriented programming systems languages and applications, ACM, New York, NY, USA, 2010

    2009

    • Margus Veanes, Pavel Grigorenko, Peli de Halleux, and Nikolai Tillmann, Symbolic Query Exploration, in ICFEM'09, Springer Verlag, December 2009
      We study the problem of generating a database and parameters for a given parameterized SQL query satisfying a given test condition. We introduce a formal background theory that includes arithmetic, tuples, and sets, and translate the generation problem into a satisfiability or model generation problem modulo the background theory. We use the satisfiability modulo theories (SMT) solver Z3 in the concrete implementation. We describe an application of model generation in the context of the database unit testing framework of Visual Studio.
    • Nuo Li, Tao Xie, Nikolai Tillmann, Jonathan de Halleux, and Wolfram Schulte, Reggae: Automated Test Generation for Programs using Complex Regular Expressions , in Proceedings of the 24th IEEE/ACM International Conference on Automated Software Engineering , November 2009
    • Margus Veanes, Nikolai Tillmann, and Jonathan de Halleux, Qex: Symbolic SQL Query Explorer, no. MSR-TR-2009-2015, October 2009
      We describe a technique and a tool called Qex for generating input tables and parameter values for a given parameterized SQL query. The evaluation semantics of an SQL query is translated into a specific background theory for a satisfiability modulo theories (SMT) solver as a set of equational axioms. Symbolic evaluation of a goal formula together with the background theory yields a model from which concrete tables and values are extracted. We use the SMT solver Z3 in the concrete implementation of Qex and provide an evaluation of its performance.
    • Margus Veanes, Peli de Halleux, and Nikolai Tillmann, Rex: Symbolic Regular Expression Explorer, no. MSR-TR-2009-137, October 2009
      Constraints in form regular expressions over strings are ubiquitous. They occur often in programming languages like Perl and C#, in SQL in form of LIKE expressions, and in web applications. Providing support for regular expression constraints in program analysis and testing has several useful applications. We introduce a method and a tool called Rex, for symbolically expressing and analyzing regular expression constraints. Rex is implemented using the SMT solver Z3, and we provide experimental evaluation of Rex.
    • Suresh Thummalapenta, Tao Xie, Nikolai Tillmann, Peli de Halleux, and Wolfram Schulte, MSeqGen: Object-Oriented Unit-Test Generation via Mining Source Code, in Proc. 7th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2009), Association for Computing Machinery, Inc., August 2009
    • Tao Xie, Nikolai Tillmann, Peli de Halleux, and Wolfram Schulte, Fitness-Guided Path Exploration in Dynamic Symbolic Execution, in Proc. the 39th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2009), IEEE, June 2009
      Dynamic symbolic execution is a structural testing technique that systematically explores feasible paths of the program under test by running the program with different test inputs to improve code coverage. To address the space-explosion issue in path exploration, we propose a novel approach called Fitnex, a search strategy that uses state-dependent fitness values (computed through a fitness function) to guide path exploration. The fitness function measures how close an already discovered feasible path is to a particular test target (e.g., covering a not-yet-covered branch). Our new fitness-guided search strategy is integrated with other strategies that are effective for exploration problems where the fitness heuristic fails. We implemented the new approach in Pex, an automated structural testing tool developed at Microsoft Research. We evaluated our new approach by comparing it with existing search strategies. The empirical results show that our approach is effective since it consistently achieves high code coverage faster than existing search strategies.
    • Michael Barnett, Manuel Fahndrich, Francesco Logozzo, Peli de Halleux, and Nikolai Tillmann, Exploiting the Synergy between Automated-Test-Generation and Programming-by-Contract, in Proc. 31st International Conference on Software Engineering (ICSE'2009) , IEEE, May 2009
      This demonstration presents two tools, Code Contracts and Pex, that utilize specification constructs for advanced testing, runtime checking, and static checking of object-oriented .NET programs.
    • Kunal Taneja, Tao Xie, Nikolai Tillmann, Jonathan de Halleux, and Wolfram Schulte, Guided Path Exploration for Regression Test Generation, in Proc. 31th International Conference on Software Engineering (ICSE 2009), IEEE, May 2009
    • Margus Veanes, Pavel Grigorenko, Peli de Halleux, and Nikolai Tillmann, Symbolic Query Exploration, no. MSR-TR-2009-65, May 2009
      We study the problem of generating a database and parameters for a given parameterized SQL query satisfying a given test condition. We introduce a formal background theory that includes arithmetic, tuples, and sets, and translate the generation problem into a satisfiability or model generation problem modulo the background theory. We use the satisfiability modulo theories (SMT) solver Z3 in the concrete implementation. We describe an application of model generation in the context of the database unit testing framework of Visual Studio.
    • Madhuri R Marri, Tao Xie, Nikolai Tillmann, Jonathan de Halleux, and Wolfram Schulte, An Empirical Study of Testing File-System-Dependent Software with Mock Objects, in Proc. 4th International Workshop on Automation of Software Test (AST 2009), Business and Industry Case Studies, IEEE Computer Society, May 2009
      Unit testing is a technique of testing a single unit of a program in isolation. The testability of the unit under test can be reduced when the unit interacts with its environment. The construction of high-covering unit tests and their execution require appropriate interactions with the environment such as a file system or database. To help set up the required environment, developers can use mock objects to simulate the behavior of the environment. In this paper, we present an empirical study to analyze the use of mock objects to test file-system-dependent software. We use a mock object of the FileSystem API provided with the Pex automatic testing tool in our study. We share our insights gained on the benefits of using mock objects in unit testing and discuss the faced challenges.
    • Soonho Kong, Nikolai Tillmann, and Jonathan de Halleux, Automated Testing of Environment-Dependent Programs - A Case Study of Modeling the File System for Pex, in Proc. 6th International Conference on Information Technology: New Generations (ITNG'09), IEEE, April 2009
      Programs that interact with the file system are a classical challenge for automated software testing. A common approach to handling this problem is to insert an abstraction layer between the application and the file system. However, even with a well-defined abstraction layer, the burden on the software developer or tester is still high: they have to understand the subtleties of the file system to craft a meaningful set of test cases. The file system is accessed through a complex API, which often causes developers to overlook obscure yet possible corner cases. In this paper, we present a parameterized model of the file system that can be used in conjunction with Pex, an automated test generation tool, to test code that depends on the file system.
    • Tao Xie, Nikolai Tillmann, Peli de Halleux, and Wolfram Schulte, Mutation Analysis of Parameterized Unit Tests, in Proc. International Conference on Software Testing, Verification and Validation Workshops, 2009. ICSTW '09., IEEE, April 2009
      Recently parameterized unit testing has emerged as a promising and effective methodology to allow the separation of (1) specifying external, black-box behavior (e.g., assumptions and assertions) by developers and (2) generating and selecting internal, white-box test inputs (i.e., high-code-covering test inputs) by tools. A parameterized unit test (PUT) is simply a test method that takes parameters, specifies assumptions on the parameters, calls the code under test, and specifies assertions. The test effectiveness of PUTs highly depends on the way that they are written by developers. For example, if stronger assumptions are specified, only a smaller scope of test inputs than intended are generated by tools, leading to false negatives in terms of fault detection. If weaker assertions are specified, erroneous states induced by the test execution do not necessarily cause assertion violations, leading to false negatives. Detecting these false negatives is challenging since the insufficiently written PUTs would just pass. In this paper, we propose a novel mutation analysis approach for analyzing PUTs written by developers and identifying likely locations in PUTs for improvement. The proposed approach is a first step towards helping developers write better PUTs in practice.
    • Dries Vanoverberghe, Nikolai Tillmann, and Frank Piessens, Test Input Generation for Programs with Pointers, in Proc. Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 15th International Conference, TACAS 2009, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, Springer Verlag, March 2009
      Software testing is an essential process to improve software quality in practice. Researchers have proposed several techniques to automate parts of this process. In particular, symbolic execution can be used to automatically generate a set of test inputs that achieves high code coverage. However, most state-of-the-art symbolic execution approaches cannot directly handle programs whose inputs are pointers, as is often the case for C programs. Automatically generating test inputs for pointer manipulating code such as a linked list or balanced tree implementation remains a challenge. Eagerly enumerating all possible heap shapes forfeits the advantages of symbolic execution. Alternatively, for a tester, writing assumptions to express the disjointness of memory regions addressed by input pointers is a tedious and labor-intensive task. This paper proposes a novel solution for this problem: by exploiting type information, disjointness constraints that characterize permissible configurations of typed pointers in byte-addressable memory can be automatically generated. As a result, the constraint solver can automatically generate relevant heap shapes for the program under test. We report on our experience with an implementation of this approach in Pex, a dynamic symbolic execution framework for .NET. We examine two different symbolic representations for typed memory, and we discuss the impact of various optimizations.
    • Nikolaj Bjørner, Nikolai Tillmann, and Andrei Voronkov, Path Feasibility Analysis for String-Manipulating Programs, in Proc. Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 15th International Conference, TACAS 2009, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, Springer Verlag, 2009
      We discuss the problem of path feasibility for programs manipulating strings using a collection of standard string library functions. We prove results on the complexity of this problem, including its undecidability in the general case and decidability of some special cases. In the context of test-case generation, we are interested in an efficient finite model finding method for string constraints. To this end we develop a two-tier finite model finding procedure. First, an integer abstraction of string constraints are passed to an SMT (Satisfiability Modulo Theories) solver. The abstraction is either unsatisfiable, or the solver produces a model that fixes lengths of enough strings to reduce the entire problem to be finite domain. The resulting fixed-length string constraints are then solved in a second phase. We implemented the procedure in a symbolic execution framework, report on the encouraging results and discuss directions for improving the method further.

    2008

    • Patrice Godefroid, Peli de Halleux, Michael Y. Levin, Aditya V. Nori, Sriram K. Rajamani, Wolfram Schulte, and Nikolai Tillmann, Automated Software Testing Using Program Analysis, in IEEE Software, IEEE Computer Society, October 2008
      During the last ten years, code inspection for standard programming errors has largely been automated with static code analysis. During the next ten years, we believe we will see similar progress in automating testing, and specifically test generation, thanks to advances in program analysis, efficient constraint solvers and powerful computers. We present an overview of several related projects currently under way at Microsoft.
    • Nikolaj Bjorner, Nikolai Tillmann, and Andrei Voronkov, Path Feasibility Analysis for String-Manipulating Programs, no. MSR-TR-2008-153, October 2008
      We discuss the problem of path feasibility for programs manipulating strings using a collection of standard string library functions. We prove results on the complexity of this problem, including its undecidability in the general case and decidability of some special cases. In the context of test-case generation, we are interested in an efficient finite model finding method for string constraints. To this end we develop a two-tier finite model finding procedure. First, an integer abstraction of string constraints are passed to an SMT (Satisfiability Modulo Theories) solver. The abstraction is either unsatisfiable, or the solver produces a model that fixes lengths of enough strings to reduce the entire problem to be finite domain. The resulting fixed-length string constraints are then solved in a second phase. We implemented the procedure in a symbolic execution framework, report on the encouraging results and discuss directions for improving the method further.
    • Tao Xie, Nikolai Tillmann, Peli de Halleux, and Wolfram Schulte, Fitness-Guided Path Exploration in Dynamic Symbolic Execution, no. MSR-TR-2008-123, September 2008
      Dynamic symbolic execution is a structural testing technique that systematically explores feasible paths of the program under test by running the program with different test inputs. Its main goal is to find a set of test inputs that lead to the coverage of particular test targets, e.g., specific state- ments or violated assertions. In theory, it is undecidable whether a test target can be covered, and in practice the number of feasible paths explodes. Nevertheless, for many programs, heuristic search strategies can often cover a test target quickly by analyzing only a few potentially feasible paths. We propose a novel approach called Fitnex, a search strategy that uses state-dependent fitness values (computed through a fitness function) to guide path exploration. The fitness function measures how close an already discovered feasible path is to a particular test target. Our new search strategy gives paths with better fitness values higher priority in the search. As a result, the search needs to consider fewer paths to cover test targets faster. Our new fitness-guided search strategy can be integrated with other strategies that are effective for exploration problems where the fitness heuristic fails. We implemented the new approach in Pex, an automated structural testing tool developed at Microsoft Research for .NET programs. We evaluated the new approach on a set of micro-benchmark programs. The results show that the new approach is effective since it consistently achieves high code coverage faster than existing search strategies.
    • Nikolai Tillmann and Jonathan de Halleux, Pex - White Box Test Generation for .NET, in Proc. of Tests and Proofs (TAP'08), Springer Verlag, Prato, Italy, April 2008
      Pex automatically produces a small test suite with high code coverage for a .NET program. To this end, Pex performs a systematic program analysis (using dynamic symbolic execution, similar to path-bounded model-checking) to determine test inputs for Parameterized Unit Tests. Pex learns the program behavior by monitoring execution traces. Pex uses a constraint solver to produce new test inputs which exercise different program behavior. The result is an automatically generated small test suite which often achieves high code coverage. In one case study, we applied Pex to a core component of the .NET runtime which had already been extensively tested over several years. Pex found errors, including a serious issue.
    • Jonathan de Halleux and Nikolai Tillmann, Parameterized Unit Testing with Pex (Tutorial), in Proc. of Tests and Proofs (TAP'08), Springer Verlag, Prato, Italy, April 2008
      This hands-on tutorial will teach the principles of Parameterized Unit Testing [5,4] with Pex [2], an automatic test input generator for .NET which performs a systematic program analysis, similar to path bounded model-checking. A parameterized unit test is simply a method that takes parameters, calls the code under test, and states assertions.
    • Saswat Anand, Patrice Godefroid, and Nikolai Tillmann, Demand-Driven Compositional Symbolic Execution, in Proc. International conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2008, Springer Verlag, 2008
      We discuss how to perform symbolic execution of large programs in a manner that is both compositional (hence more scalable) and demand-driven. Compositional symbolic execution means finding feasible interprocedural program paths by composing symbolic executions of feasible intraprocedural paths. By demand-driven, we mean that as few intraprocedural paths as possible are symbolically executed in order to form an interprocedural path leading to a specific target branch or statement of interest (like an assertion). A key originality of this work is that our demand-driven compositional interprocedural symbolic execution is performed entirely using first-order logic formulas solved with an off-the-shelf SMT (Satisfiability-Modulo-Theories) solver – no procedure in-lining or custom algorithm is required for the interprocedural part. This allows a uniform and elegant way of summarizing procedures at various levels of detail and of composing those using logic formulas. We have implemented a prototype of this novel symbolic execution technique as an extension of Pex, a general automatic testing framework for .NET applications. Preliminary experimental results are encouraging. For instance, our prototype was able to generate tests triggering assertion violations in programs with large numbers of program paths that were beyond the scope of non-compositional test generation.
    • Nikolai Tillmann and Jonathan de Halleux, White-box testing of behavioral web service contracts with Pex, in TAV-WEB '08: Proceedings of the 2008 workshop on Testing, analysis, and verification of web services and applications, Association for Computing Machinery, Inc., New York, NY, USA, 2008
      A web service exposes a public API that can be accessed by potentially hostile clients over the internet. Pex, a white-box test generation tool for .NET, can automatically create test inputs that cover comer cases of a web service implemented in .NET, simulating a malicous attacker.
    • Christoph Csallner, Nikolai Tillmann, and Yannis Smaragdakis, DySy: Dynamic Symbolic Execution for Invariant Inference, in ICSE '08: Proceedings of the 30th international conference on Software engineering, Association for Computing Machinery, Inc., New York, NY, USA, 2008
      Dynamically discovering likely program invariants from concrete test executions has emerged as a highly promising software engineering technique. Dynamic invariant inference has the advantage of succinctly summarizing both "expected" program inputs and the subset of program behaviors that is normal under those inputs. In this paper, we introduce a technique that can drastically increase the relevance of inferred invariants, or reduce the size of the test suite required to obtain good invariants. Instead of falsifying invariants produced by pre-set patterns, we determine likely program invariants by combining the concrete execution of actual test cases with a simultaneous symbolic execution of the same tests. The symbolic execution produces abstract conditions over program variables that the concrete tests satisfy during their execution. In this way, we obtain the benefits of dynamic inference tools like Daikon: the inferred invariants correspond to the observed program behaviors. At the same time, however, our inferred invariants are much more suited to the program at hand than Daikon's hard-coded invariant patterns. The symbolic invariants are literally derived from the program text itself, with appropriate value substitutions as dictated by symbolic execution. We implemented our technique in the DySy tool, which utilizes a powerful symbolic execution and simplification engine. The results confirm the benefits of our approach. In Daikon's prime example benchmark, we infer the majority of the interesting Daikon invariants, while eliminating invariants that a human user is likely to consider irrelevant.
    • Margus Veanes, Colin Campbell, Wolfgang Grieskamp, Wolfram Schulte, Nikolai Tillmann, and Lev Nachmanson, Model-Based Testing of Object-Oriented Reactive Systems with Spec Explorer, in Formal Methods and Testing, vol. 4949, pp. 39-76, Springer Verlag, 2008
      Testing is one of the costliest aspects of commercial software development. Model-based testing is a promising approach addressing these deficits. At Microsoft, model-based testing technology developed by the Foundations of Software Engineering group in Microsoft Research has been used since 2003. The second generation of this tool set, Spec Explorer, deployed in 2004, is now used on a daily basis by Microsoft product groups for testing operating system components, .NET framework components and other areas. This chapter provides a comprehensive survey of the concepts of the tool and their foundations.
    • Dries Vanoverberghe, Nikolaj Bjørner, Jonathan de Halleux, Wolfram Schulte, and Nikolai Tillmann, Using Dynamic Symbolic Execution to Improve Deductive Verification, in Proc. 15th International SPIN Workshop, Springer Verlag, 2008
      One of the most challenging problems in deductive program verification is to find inductive program invariants typically expressed using quantifiers. With strong-enough invariants, existing provers can often prove that a program satisfies its specification. However, provers by themselves do not find such invariants. We propose to automatically generate executable test cases from failed proof attempts using dynamic symbolic execution by exploring program code as well as contracts with quantifiers. A developer can analyze the test cases with a traditional debugger to determine the cause of the error; the developer may then correct the program or the contracts and repeat the process.

    2007

    • Christoph Csallner, Nikolai Tillmann, and Yannis Smaragdakis, DySy: Dynamic Symbolic Execution for Invariant Inference, no. MSR-TR-2007-151, November 2007
      Dynamically discovering likely program invariants from concrete test executions has emerged as a highly promising software engineering technique. Dynamic invariant inference has the advantage of succinctly summarizing both “expected”program inputs and the subset of program behaviors that is normal under those inputs. In this paper, we introduce a technique that can drastically increase the relevance of inferred invariants, or reduce the size of the test suite required to obtain good invariants. Instead of falsifying invariants produced by pre-set patterns, we determine likely program invariants by combining the concrete execution of actual test cases with a simultaneous symbolic execution of the same tests. The symbolic execution produces abstract conditions over program variables that the concrete tests satisfy during their execution. In this way, we obtain the benefits of dynamic inference tools like Daikon: the inferred invariants correspond to the observed program behaviors. At the same time, however, our inferred invariants are much more suited to the program at hand than Daikon’s hardcoded invariant patterns. The symbolic invariants are literally derived from the program text itself, with appropriate value substitutions as dictated by symbolic execution. We implemented our technique in the DySy tool, which utilizes a powerful symbolic execution and simplification engine. The results confirm the benefits of our approach. In Daikon’s prime example benchmark, we infer the majority of the interesting Daikon invariants, while eliminating invariants that a human user is likely to consider irrelevant.
    • Saswat Anand, Patrice Godefroid, and Nikolai Tillmann, Demand-Driven Compositional Symbolic Execution, no. MSR-TR-2007-138, October 2007
      We discuss how to perform symbolic execution of large programs in a manner that is both compositional (hence more scalable) and demand-driven. Compositional symbolic execution means finding feasible interprocedural program paths by composing symbolic executions of feasible intraprocedural paths. By demand-driven, we mean that as few intraprocedural paths as possible are symbolically executed in order to form an interprocedural path leading to a specific target branch or statement of interest (like an assertion). A key originality of this work is that our demand-driven compositional interprocedural symbolic execution is performed entirely using first-order logic formulas solved with an off-the-shelf SMT (Satisfiability-Modulo-Theories) solver – no procedure in-lining or custom algorithm is required for the interprocedural part. This allows a uniform and elegant way of summarizing procedures at various levels of detail and of composing those using logic formulas. This novel symbolic execution technique has been implemented for automatic test input generation in conjunction with Pex, a general automatic testing framework for .NET applications. Preliminary experimental results are enouraging. For instance, our tool was able to generate tests triggering assertion violations in programs with large numbers of program paths that were beyond the scope of non-compositional test generation.
    • Nicolas Kicillof, Wolfgang Grieskamp, Nikolai Tillmann, and Victor Braberman, Achieving both model and code coverage with automated gray-box testing, in A-MOST '07: Proceedings of the 3rd international workshop on Advances in model-based testing, Association for Computing Machinery, Inc., New York, NY, USA, 2007
      We have devised a novel technique to automatically generate test cases for a software system, combining black-box model-based testing with white-box parameterized unit testing. The former provides general guidance for the structure of the tests in the form of test sequences, as well as the oracle to check for conformance of an application under test with respect to a behavioral model. The latter finds a set of concrete parameter values that maximize code coverage using symbolic analysis. By applying these techniques together, we can produce test definitions (expressed as code to be run in a test management framework) that exercise all selected paths in the model, while also covering code branches specific to the implementation. These results cannot be obtained from any of the individual approaches alone, as the model cannot predict what values are significant to a particular implementation, while parameterized unit testing requires manually written test sequences and correctness validations. We provide tool support, integrated into our model-based testing tool.

    2006

    • Nikolai Tillmann and Wolfram Schulte, Unit Tests Reloaded: Parameterized Unit Testing with Symbolic Execution, in IEEE Software, vol. 23, no. 4, pp. 38-47, IEEE, July 2006
      Unit tests are popular, but it is an art to write them in a way that specifies a program’s behavior well and it is laborious to write enough of them to have confidence in the correctness of an implementation. Symbolic execution is an approach that can help. We describe techniques for unit testing based on symbolic execution. These techniques can be used to increase code coverage by finding relevant variations of existing unit tests, and they can be used to generate unit tests from an implementation automatically when no prior unit tests exist. The adoption of symbolic analysis techniques in commercial testing tools has already begun.
    • Nikolai Tillmann, Feng Chen, and Wolfram Schulte, Discovering Likely Method Specifications, no. MSR-TR-2005-146, March 2006
      It is widely accepted that software specifications are of great use for more rigorous software development. They can be used for formal verification and automated testing. They are essential for precise program understanding. But despite their usefulness, specifications often do not exist in practice. This paper describes a new way to automatically infer specifications from code. Given a modifier method and a set of observer methods, we first symbolically execute the modifier method to obtain a set of execution paths. Then, the conditions and final states of the paths are summarized by observer methods. The result is a likely specification of the modifier method that is compact and human-understandable. The inferred specification can be examined by the user, used as input to program verification systems, or as input for test generation tools for validation, We implemented the technique for .NET programs in a tool, called Axiom Meister. Our preliminary experience has been promising. We were able to infer concise specifications for base classes of the .NET platform and found flaws in the design of a new library.
    • Wolfgang Grieskamp, Nikolai Tillmann, and Nicolas Kicillof, Action Machines: a Framework for Encoding and Composing Partial Behaviors, no. MSR-TR-2006-11, February 2006
      We describe action machines, a framework for encoding and composing partial behavioral descriptions. Action machines encode behavior as a variation of labeled transition systems where the labels are observable activities of the described artifact and the states capture full data models. Labels may also have structure, and both labels and states may be partial with a symbolic representation of the unknown parts. Action machines may stem from software models like state machines, scenario machines, or (model) programs, and can be composed in a variety of ways to synthesize new behaviors. The composition operators include synchronized parallel composition, sequential composition, hierarchical composition, and alternating simulation. We use action machines in analysis processes such as model checking and model-based testing. The current main application is in the area of model-based conformance testing, where our approach addresses practical problems users at Microsoft have in applying model-based testing technology.
    • Nikolai Tillmann, Feng Chen, and Wolfram Schulte, Discovering Likely Method Specifications, in Proceedings of the 8th International Conference on Formal Engineering Methods (ICFEM’06), LNCS , Springer Verlag, 2006
      It is widely accepted that software specifications are of great use for more rigorous software development. They can be used for formal verification and automated testing. They are essential for precise program understanding. But despite their usefulness, specifications often do not exist in practice. This paper describes a new way to automatically infer specifications from code. Given a modifier method and a set of observer methods, we first symbolically execute the modifier method to obtain a set of execution paths. Then, the conditions and final states of the paths are summarized by observer methods. The result is a likely specification of the modifier method that is compact and human-understandable. The inferred specification can be examined by the user, used as input to program verification systems, or as input for test generation tools for validation, We implemented the technique for .NET programs in a tool, called Axiom Meister. Our preliminary experience has been promising. We were able to infer concise specifications for base classes of the .NET platform and found flaws in the design of a new library.
    • Nikolai Tillmann and Wolfram Schulte, Mock-object generation with behavior, in Automated Software Engineering, 2006. ASE '06. 21st IEEE/ACM International Conference on, IEEE Computer Society, 2006
      Unit testing is a popular way to guide software development and testing. Each unit test should target a single feature, but in practice it is difficult to test features in isolation. Mock objects are a well-known technique to substitute parts of a program which are irrelevant for a particular unit test. Today mock objects are usually written manually supported by tools that generate method stubs or distill behavior from existing programs. We have developed a prototype tool based on symbolic execution of .NET code that generates mock objects including their behavior by analyzing all uses of the mock object in a given unit test. It is not required that an actual implementation of the mocked behavior exists. We are working towards an integration of our tool into Visual Studio Team System
    • Wolfgang Grieskamp, Nikolai Tillmann, and Wolfram Schulte, XRT- Exploring Runtime for .NET Architecture and Applications, in Electr. Notes Theor. Comput. Sci., vol. 144, no. 3, pp. 3-26, 2006
      XRT – Exploring Runtime – is an exploration framework for programs represented in Microsoft's common intermediate language (CIL). Processing .NET managed assemblies, it provides means for analyzing, rewriting, and executing the rewritten program. Whereas XRT's representation of state allows for arbitrary exploration strategies, it is particularly optimized for transactional exploration, where a transaction may consist of many instruction steps. XRT supports extensions, and one such extension is a module for symbolic exploration which captures the complete domain of safe CIL. Current applications of XRT are in the area of testing, namely parameterized unit testing and state-space exploration for model-based testing. This paper gives an overview of the architecture of XRT and outlines the applications.

    2005

    • Wolfgang Grieskamp, Nikolai Tillmann, Colin Campbell, Wolfram Schulte, and Margus Veanes, Action Machines - Towards a Framework for Model Composition, Exploration and Conformance Testing Based on Symbolic Computation, in Quality Software, 2005. (QSIC 2005). Fifth International Conference on , IEEE Computer Society, September 2005
      We report on a framework for model composition, exploration, and conformance testing based on symbolic execution. The foundations of this framework are given by so-called action machines, a variation of labeled transition systems which communicate over synchronously stepped actions and constraints attached to them. The implementation is based on XRT, an exploration engine for .NET CIL code which supports symbolic computation. The main application is in the area of model-based conformance testing, where our approach addresses practical problems users at Microsoft have in applying model-based testing technology.
    • Wolfgang Grieskamp and Nikolai Tillmann, Action Machines - Towards a Framework for Model Composition, Exploration and Conformance Testing Based on Symbolic Computation, no. MSR-TR-2005-60, May 2005
      We report on a framework for model composition, exploration, and conformance testing based on symbolic execution. The foundations of this framework are given by so-called action machines, a variation of labeled transition systems which communicate over synchronously stepped actions and constraints attached to them. The implementation is based on XRT, an exploration engine for .NET CIL code which supports symbolic computation. The main application is in the area of model-based conformance testing, where our approach addresses practical problems users at Microsoft have in applying model-based testing technology.
    • Colin Campbell, Wolfgang Grieskamp, Lev Nachmanson, Wolfram Schulte, Nikolai Tillmann, and Margus Veanes, Testing Concurrent Object-Oriented Systems with Spec Explorer, in FM, Springer, 2005
      We describe a practical model-based testing tool developed at Microsoft Research called Spec Explorer. Spec Explorer enables modeling and automatic testing of concurrent object-oriented systems. These systems take inputs as well as provide outputs in form of spontaneous reactions, where inputs and outputs can be arbitrary data types, including objects. Spec Explorer is being used daily by several Microsoft product groups. The here presented techniques are used to test operating system components and Web service infrastructure.
    • Margus Veanes, Colin Campbell, Wolfram Schulte, and Nikolai Tillmann, Online testing with model programs, in ESEC/SIGSOFT FSE, ACM, 2005
      Online testing is a technique in which test derivation from a model program and test execution are combined into a single algorithm. We describe a practical online testing algorithm that is implemented in the model-based testing tool developed at Microsoft Research called Spec Explorer. Spec Explorer is being used daily by several Microsoft product groups. Model programs in Spec Explorer are written in the high level specification languages AsmL or Spec#. We view model programs as implicit definitions of interface automata. The conformance relation between a model and an implementation under test is formalized in terms of refinement between interface automata. Testing then amounts to a game between the test tool and the implementation under test.
    • Ana Paiva, Nikolai Tillmann, João C. P. Faria, and Raul F. A. M. Vidal, Modeling and Testing Hierarchical GUIs, in Abstract State Machines, 2005
    • Yuri Gurevich and Nikolai Tillmann, Partial updates, in Theor. Comput. Sci., vol. 336, no. 2-3, pp. 311-342, Springer Verlag, 2005
      A datastructure instance, e.g. a set or file or record, may be modified independently by different parts of a computer system. The modifications may be nested. Such hierarchies of modifications need to be efficiently checked for consistency and integrated. This is the problem of partial updates in a nutshell. In our first paper on the subject, we developed an algebraic framework which allowed us to solve the partial update problem for some useful datastructures including counters, sets and maps. These solutions are used for the efficient implementation of concurrent data modifications in the specification language AsmL. The two main contributions of this paper are (i)~a more general algebraic framework for partial updates and (ii)~a solution of the partial update problem for sequences and labeled ordered trees.
    • Nikolai Tillmann and Wolfram Schulte, Parameterized Unit Tests with Unit Meister, in Proceedings of the 10th European Software Engineering Conference held jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Association for Computing Machinery, Inc., 2005
      Parameterized unit tests extend the current industry practice of using closed unit tests defined as parameterless methods. Traditional closed unit tests are re-obtained by instantiating the parameterized unit tests. We have developed the prototype tool Unit Meister, which uses symbolic execution and constraint solving to automatically compute a minimal set of inputs that exercise a parameterized unit test given certain coverage criteria. In addition, the parameterized unit tests can be used as symbolic summaries during symbolic execution, which allows our approach to scale for arbitrary abstraction levels. Unit Meister has a command-line interface, and is also integrated into Visual Studio 2005 Team System.
    • Nikolai Tillmann and Wolfram Schulte, Parameterized Unit Tests, in Proceedings of the 10th European Software Engineering Conference held jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Association for Computing Machinery, Inc., 2005
      Parameterized unit tests extend the current industry practice of using closed unit tests defined as parameterless methods. Parameterized unit tests separate two concerns: 1) They specify the external behavior of the involved methods for all possible test arguments. 2) Test cases can be re-obtained as traditional closed unit tests by instantiating the parameterized unit tests. Symbolic execution and constraint solving can be used to automatically choose a minimal set of inputs that exercise a parameterized unit test with respect to possible code paths of the implementation. In addition, parameterized unit tests can be used as symbolic summaries which allows symbolic execution to scale for arbitrary abstraction levels. We have developed a prototype tool which computes test cases from parameterized unit tests; it is integrated into the forthcoming Visual Studio Team System product. We report on its first use to test parts of the .NET base class library.
    • Ana Paiva, João C. P. Faria, Nikolai Tillmann, and Raul F. A. M. Vidal, A Model-to-Implementation Mapping Tool for Automated Model-Based GUI Testing, in ICFEM, Springer Verlag, 2005
      This paper presents extensions to Spec Explorer to automate the testing of software applications through their GUIs based on a formal specification in Spec . Spec Explorer, a tool developed at Microsoft Research, already supports automatic generation and execution of test cases for API testing, but requires that the actions described in the model are bound to methods in a .Net assembly. The tool described in this paper extends Spec Explorer to automate GUI testing: it adds the capability to gather information about the physical GUI objects that are the target of the user actions described in the model; and it automatically generates a .Net assembly with methods that simulate those actions upon the GUI application under test. The GUI modelling and the overall test process supported by these tools are described. The approach is illustrated with the Notepad application.

    2004

    • Mike Barnett, Wolfgang Grieskamp, Lev Nachmanson, Wolfram Schulte, Nikolai Tillmann, and Margus Veanes, Towards a Tool Environment for Model-Based Testing with AsmL, in FATES 2003, Springer Verlag, 2004
      We present work on a tool environment for model-based testing with the Abstract State Machine Language (AsmL). Our environment supports semi-automatic parameter generation, call sequence generation and conformance testing. We outline the usage of the environment by an example, discuss its underlying technologies, and report on some applications conducted in the Microsoft environment.
    • Lev Nachmanson, Margus Veanes, Wolfram Schulte, Nikolai Tillmann, and Wolfgang Grieskamp, Optimal strategies for testing nondeterministic systems, in ISSTA 2004, ACM, 2004
      This paper deals with testing of nondeterministic software systems. We assume that a model of the nondeterministic system is given by a directed graph with two kinds of vertices: states and choice points. Choice points represent the nondeterministic behavior of the implementation under test (IUT). Edges represent transitions. They have costs and probabilities. Test case generation in this setting amounts to generation of a game strategy. The two players are the testing tool (TT) and the IUT. The game explores the graph. The TT leads the IUT by selecting an edge at the state vertices. At the choice points the control goes to the IUT. A game strategy decides which edge should be taken by the TT in each state. This paper presents three novel algorithms 1) to determine an optimal strategy for the bounded reachability game, where optimality means maximizing the probability to reach any of the given final states from a given start state while at the same time minimizing the costs of traversal; 2) to determine a winning strategy for the bounded reachability game, which guarantees that given final vertices are reached, regardless how the IUT reacts; 3) to determine a fast converging edge covering strategy, which guarantees that the probability to cover all edges quickly converges to 1 if TT follows the strategy.
    • Wolfgang Grieskamp, Nikolai Tillmann, and Margus Veanes, Instrumenting Scenarios in a Model-Driven Development Environment, in Journal of Information and Software Technology, vol. 46, no. 15, pp. 1027-1036, Elsevier , 2004
      SpecExplorer is an integrated environment for modeldriven development of .NET software. In this paper we discuss how scenarios can be described in SpecExplorer’s modeling language, Spec#, and how the SpecExplorer tool can be used to validate those scenarios by various means.

    2003

    • Mike Barnett, Wolfgang Grieskamp, Clemens Kerer, Wolfram Schulte, Clemens Szyperski, Nikolai Tillmann, and Arthur Watson, Serious Specification for Composing Components, in Proceedings of the 6th ICSE Workshop on Component-Based Software Engineering: Automated Reasoning and Prediction, May 2003
      We discuss the use of an industrial-strength specification language to specify component-level contracts for a product group within Microsoft. We outline how the specification language evolved to meet the needs of the component-based approach followed by that group. The specification language, AsmL, is executable which allows for testing to be done using runtime verification. Runtime verification dynamically monitors the behavior of a component to ensure that it conforms to its specification.
    • Wolfgang Grieskamp, Lev Nachmanson, Nikolai Tillmann, and Margus Veanes, Test Case Generation from AsmL Specifications, in ASM 2003, Springer Verlag, March 2003
    • Michael Barnett, Wolfram Schulte, and Nikolai Tillmann, Using AsmL for Runtime Verification, in Abstract State Machines, Advances in Theory and Practice, 10th International Workshop, ASM 2003, Springer Verlag, March 2003
      We use AsmL, an executable specification language created by the Foundations of Software Engineering group [1], to describe the behavior of .NET components and perform runtime verification. AsmL is based on the theory of ASMs; AsmL is an industrial-strength specification language that is a full member of the Microsoft .NET Framework.
    • Michael Barnett, Wolfgang Grieskamp, Wolfram Schulte, Nikolai Tillmann, and Margus Veanes, Validating Use-Cases with the AsmL Test Tool, in QSIC, IEEE Computer Society, 2003
      The Abstract State Machine Language supports use-case oriented modeling in a faithful way. In this paper we discuss how the AsmL test tool, a new component of the AsmL tool environment, is used to generate finite state machines from use-case models which can be used for validation purposes or for testing.
    • Mike Barnett, Wolfram Schulte, and Nikolai Tillmann, .NET Contracts: Attaching Specifications to Components, in Practical Foundations of Business System Specifications, pp. 83–98, Springer Verlag, 2003
      We use AsmL to specify classes and interfaces for .NET components. The AsmL description characterizes precisely the structure and possible behavior of a component. It enables a client to effectively use it without access to the source code. AsmL specifications can also be used to perform the runtime verification of .NET components. Runtime verification dynamically monitors the behavior of a component to ensure that it conforms to its specification. Initial work a Microsoft product team to integrate AsmL technology is under way.
    • Yuri Gurevich and Nikolai Tillmann, Partial Updates Exploration II, in Abstract State Machines, Springer Verlag, 2003
    • Mike Barnett, Wolfgang Grieskamp, Lev Nachmanson, Wolfram Schulte, Nikolai Tillmann, and Margus Veanes, Model-Based Testing with AsmL .NET, in 1st European Conference on Model-Driven Software Engineering, 2003
      We present work on a tool environment for model-based testing with the Abstract State Machine Language (AsmL). Our environment supports semi-automatic parameter generation, call sequence generation and conformance testing. We outline the usage of the environment by an example, discuss its underlying technologies, and report on some applications conducted in the Microsoft environment.

    2001

    • Wolfgang Grieskamp, Markus Lepper, Wolfram Schulte, and Nikolai Tillmann, Testable Use Cases in the Abstract State Machine Language, in Proc. 2nd Asia-Pacific Conference on Quality Software (APAQS 2001), IEEE Computer Society, 2001
      Use cases are a method for describing interactions between humans and/or systems. However, despite their popularity, there is no agreed formal syntax and semantics of use cases. The Abstract State Machine Language (ASML) is an executable specification language developed at Microsoft Research. In this paper we define an encoding of use cases in ASML and demonstrate the advantages by describing techniques to generate test cases and test oracles from the encoding.
    • Yuri Gurevich and Nikolai Tillmann, Partial Updates: Exploration, in J. UCS, vol. 7, no. 11, pp. 917-951, Springer Verlag, 2001
      The partial update problem for parallel abstract state machines has manifested itself in the cases of counters, sets and maps. We propose a solution of the problem that lends itself to an efficient implementation and covers the three cases mentioned above. There are other cases of the problem that require a more general framework.

    Conferences, workshops

    Contact

    Microsoft Research
    One Microsoft Way
    Building: B99, Room 2124
    Redmond, WA 98052-6399, USA

    Email: nikolait at microsoft dot com
    Phone: +1 425 707-6031
    Fax: +1 425 936-7329

    http://research.microsoft.com/~nikolait/
    How to get to MSR, Redmond