On Thursday, October 15, 2009, Microsoft Research, in collaboration with Carnegie Mellon University, hosted four tutorials designed to facilitate scientific discovery by extending the reach and utility of eScience efforts. Below are brief descriptions of the eScience research technologies presented in these tutorial sessions, followed by links to the full video presentations.
Session M1: Project Trident: A Scientific Workflow Workbench
Dean Guo and Yan Xu, Microsoft Research
Keith Grochow, University of Washington
Workflow is a key component of almost every major science project today, across a wide range of domains. Project Trident leverages Windows Workflow Foundation (WWF), which is part of the Microsoft .NET Framework, for core workflow support and implements only the functionality required for scientific workflows. Project Trident will be an open source scientific workflow workbench. Currently, the binary is available for free download at Project Trident: A Scientific Workflow Workbench.
This tutorial describes the architecture, design, and key features of Project Trident. It shows you how to use the Trident scientific workflow workbench to accomplish the following:
- Author workflows and customize workflow activities
- Manage workflows on a desktop and scale out to Windows HPC (High Performance Computing) clusters
- Provide runtime services, such as provenance and workflow runtime monitoring
- Manage workflow versioning and a personalized workflow catalog
- Expose Project Trident runtime as a Web service and run workflows from a Microsoft Silverlight-enabled browser or other application (such as Microsoft Office Word) for reproducible research
- Use myExperiment as a portal for sharing workflows
- Use Microsoft Project Trident Connection Point to request features and bug fixes and to share best practices
Session M2: Microsoft Cloud Computing Frameworks for Research
Jared Jackson and Christophe Poulain, Microsoft Research
Simon Woodman and Hugo Hiden, Newcastle University
Computing-enabled scientific and engineering research has emerged as the third pillar of the scientific process, complementing theory and experiment. The challenge of satisfying the ever-rising demand for research computing and data management, which together enable scientific discovery, continues to grow. Fortunately, the emergence of cloud computing (software and services hosted by networks of commercial data centers and accessible over the Internet) offers a solution to this conundrum.
This tutorial describes cloud technologies that broaden the scientific and research community’s access to data- and compute-intensive resources. We start with an overview of cloud computing today. Then we examine cloud application frameworks by looking at Dryad, DryadLINQ, and Microsoft Azure. Throughout the tutorial, scientific examples illustrate the potential applications.
Session A1: Tools to Support e-Research – Microsoft Research and the Scholarly Information Ecosystem
Oscar Naim and Lee Dirks, Microsoft Research
Microsoft External Research strongly supports the process of research and its role in the innovation ecosystem, including developing and supporting efforts in open access, open tools, open technology, and interoperability. Microsoft External Research collaborates with universities, national libraries, publishers, and governmental organizations to help develop tools and services to evolve the scholarly information lifecycle. These projects demonstrate our ongoing work towards producing next-generation documents that increase productivity and empower authors to increase the discoverability and appropriate re-use of their work.
This workshop provides a deep view into several freely available tools from Microsoft External Research and demonstrates how they can help supplement and enhance your e-research. The hands-on component of this session helps you gain a deeper technical understanding of the available toolset, which includes the following resources:
- Research Information Centre (RIC): An online virtual research environment for collaborative work
- Tools for authors
  - Structured document authoring (based on the NLM-DTD)
  - Ontology integration and markup
  - Repository search integration
  - ORE resource map authoring
  - Article repository submission workflow (via REST and SWORD interfaces)
- Zentity: A research-output repository platform
  - Version 1.0 is available
- Other related services
  - Bing Translator (Web service)
  - Document/file format conversion (Web service)
Session A2: The Microsoft Biology Initiative: An Open Source Framework and Toolset for Bioinformatics Research
Michael Zyskowski, Simon Mercer, and Jared Jackson, Microsoft Research
Chris Wu, Carnegie Mellon University
Jim Hogan and Lawrence Buckingham, Queensland University of Technology
Jaroslaw Pillardy and Robert Bukowski, Cornell University
The Microsoft Biology Initiative (MBI) has two distinct components: the Microsoft Biology Framework (MBF) and the Microsoft Biology Toolset (MBT). MBF is a set of Microsoft .NET assemblies that implement file parsers and writers for common formats, common algorithms, and access to a set of common Web services used in bioinformatics, specifically in the domains of DNA sequencing, assembly, annotation, and analysis. The framework provides a common object model for the representation, analysis, and visualization of DNA, RNA, and protein sequences. MBT is a set of tools, some of which are already built upon MBF, for directed scientific analysis and the discovery of relationships related to the human genome. The tools and framework comprise a set of extensible, open source technologies that will enable Microsoft and third parties to conduct rich genomic science research on the Windows platform. This workshop introduces the audience to the foundational components of MBF, shows how the framework can be extended to address specific scientific analysis problems, and provides an overview of the underlying code, which, because the project is open source, can be extended into areas not yet addressed.