The Microsoft Biology Tools (MBT) are a collection of tools that enable biology and bioinformatics researchers to be more productive in making scientific discoveries. Some of the tools provided here take advantage of the capabilities of .NET Bio (formerly Microsoft Biology Foundation), and are good examples of how you can use .NET Bio to create other tools.
Microsoft Biology Tools
BL!P: BLAST in Pivot
BL!P [blip], also known as BLAST in Pivot, is a tool that automates NCBI BLAST searches, fetches associated GenBank records, and converts this information into a Silverlight PivotViewer collection. Also, BL!P provides a user interface to create customized images for each BLAST match, allowing the user to further customize their data exploration experience.
- Install, download source code and/or participate via the BL!P CodePlex site
Microsoft Research Biology Extension for Excel
The Microsoft Research Biology Extension for Excel is for biologists who are interested in, or already using, Microsoft Excel for their research. The extensible nature of Excel lends itself to the creation of a custom ribbon component labeled “Bioinformatics” in the top menu bar of the application. Through this customization, all the usual features of Excel remain available, but the biologist now has access to additional dedicated bioinformatics features as well. The Biology Extension can also be extended by developers to use other features in .NET Bio. Developers can write custom bioinformatics applications by using the libraries of .NET Bio and can add UI elements for those applications to the Biology Extension.
Note: Biology Extension for Excel code is hosted on the .NET Bio CodePlex site.
Microsoft Research Sequence Assembler
The Microsoft Research Sequence Assembler application is intended for use by biologists and laboratory technicians who are responsible for managing next-generation genomic sequencing data for alignment, assembly, and/or BLAST identification. Though many other full-featured applications provide similar functionality, the primary goal of offering this application is to expose the capabilities available in .NET Bio. When coupled with the rich user interface (UI) elements that the Windows Presentation Foundation supports, the Sequence Assembler provides a unique combination of advanced application interface, controls, and visualization of the data. These features can be useful to scientists, researchers, and clinicians who work with genomic data, but more importantly, the Sequence Assembler is a full-featured sample application that can be extended, modified, and expanded to meet the needs of researchers.
- Install the Sequence Assembler 2.0
- Install the Sequence Assembler 1.0
- Learn more about the Sequence Assembler
Note: Sequence Assembler code is hosted on the .NET Bio CodePlex site.
Microsoft Computational Biology Tools
Bio Model Analyzer
Bio Model Analyzer is a new biological modeling tool that illustrates signaling pathways and determines cellular stabilization. The tool represents a merging of perspectives from systems biology, formal methods, human-computer interaction, and design. At one level, Bio Model Analyzer is a sketching tool that enables users to depict a biological system of interest by dragging and dropping cells, their contents, extracellular components, and relationships onto a simple canvas. At another level, Bio Model Analyzer’s analysis proves stabilization of biological systems based upon formal methods that were developed for the specification and verification of properties in concurrent software systems. Find more information, documentation, and downloads at Bio Model Analyzer.
Create Epitome
This tool takes as input a weighted list of amino acid sequences and creates epitomes of all lengths. Install, download source, and/or participate in development at the Create Epitome CodePlex site. A web version of the tool is also available.
DNA Strand Displacement Simulator
DNA Strand Displacement Simulator (DSD) is a new programming language for designing and simulating DNA circuits in which strand displacement is the main computational mechanism. Find more information and a link to the online simulator at A Programming Language for Composable DNA Circuits.
Epitope Prediction
This tool computes the probability that a given k-mer is a T-cell epitope restricted to a given HLA allele. Install, download source, and/or participate in development at the Epitope Prediction CodePlex site. A web version of the tool is also available.
False Discovery Rate
This tool estimates the false discovery rate for 2x2 contingency tables, based on Fisher’s statistics. Install, download source, and/or participate in development at the False Discovery Rate CodePlex site. A web version of the tool is also available.
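To make the idea concrete, here is a minimal Python sketch of one common way to go from a set of 2x2 contingency tables to false discovery rate estimates: compute a two-sided Fisher exact test p-value for each table, then convert the p-values to q-values with the Benjamini-Hochberg procedure. This is an illustration under stated assumptions, not the method the False Discovery Rate tool itself uses; the function names `fisher_exact_two_sided` and `benjamini_hochberg` are hypothetical and chosen only for this example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg q-values (an assumed, standard FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


if __name__ == "__main__":
    # Hypothetical example tables, not data from the tool.
    tables = [(3, 0, 0, 3), (1, 1, 1, 1), (8, 2, 1, 9)]
    pvals = [fisher_exact_two_sided(*t) for t in tables]
    for table, p, q in zip(tables, pvals, benjamini_hochberg(pvals)):
        print(table, round(p, 4), round(q, 4))
```

A q-value read this way is the smallest false discovery rate at which the corresponding table would still be called significant.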
Genetic Engineering of Living Cells
Genetic Engineering of Living Cells (GEC) is a new programming language that allows logical interactions between potentially undetermined proteins and genes to be expressed in a modular manner. Programs can be translated by a compiler into sequences of standard biological parts. Find more information and download a prototype compiler at A Programming Language for Genetic Engineering of Living Cells.
HLA Assignment
This tool takes lab data from a series of patients and determines probabilistically which HLA genes are responsible for a patient’s reaction. Install, download source, and/or participate in development at the HLA Assignment CodePlex site. A web version of the tool is also available.
HLA Completion
This tool takes as input HLA typing data (loci A, B, and C) and probabilistically resolves the typing ambiguities. Install, download source, and/or participate in development at the HLA Completion CodePlex site. A web version of the tool is also available.
PhyloD (Phylogeny-Based Association Analysis) is a statistical tool that can identify HIV mutations that defeat the function of the HLA proteins in certain patients, thereby allowing the virus to escape elimination by the immune system. Install, download source, and/or participate in development at the PhyloD CodePlex site. A web version of the tool is also available.
The Stochastic Pi Machine (SPiM) is a simulator for the stochastic pi-calculus that can be used to simulate models of biological systems. The machine has been formally specified, and the specification has been proved correct with respect to the calculus. Find more information, documentation, and downloads at Stochastic Pi Machine.
Synthesizing Biological Theories
This tool enables biologists and modelers to construct high-level theories and models of biological systems, capturing biological hypotheses, inferred mechanisms, and experimental results within the same framework. Among the key features of the tool are convenient ways to represent several competing theories and the interactive nature of building and running the models by using an intuitive, rigorous, scenario-based visual language. Find more information, documentation, and downloads at Synthesizing Biological Theories.
Other Useful Tools for Researchers
3D Molecule Viewer
3D Molecule Viewer is a stand-alone, demo version of the C-ME application that InterKnowlogy built for the Scripps Research Institute (TSRI). It is a WPF application built in C#. This stand-alone, source code version of the application does not have the Microsoft SharePoint dependency and allows you to open sample 3D Protein Data Bank (PDB) format files directly, spin them in 3D, zoom in on them, display them from different views, and so forth. Install, download source, and/or participate in development at the 3D Molecule Viewer CodePlex site.
Athena allows biological models to be constructed as modules. Modules can be connected to one another without altering the modules themselves. In addition, Athena houses various tools useful for designing synthetic networks, including tools to perform simulations, automatically derive transcription rate expressions, and view and edit synthetic DNA sequences. Install, download source, and/or participate in development at the Athena CodePlex site.
Chemistry Add-in for Word
The Chemistry Add-in for Word makes it easier to insert and modify chemical information, such as labels, formulas, and 2D depictions, within Microsoft Word. Additionally, it enables the creation of inline “chemical zones,” the rendering of print-ready visual depictions of chemical structures, and the ability to store and expose chemical information in a semantically rich manner. Install, download source, and/or participate in development at Chemistry Add-in for Word.
Infer.NET is a .NET library for machine learning. It provides state-of-the-art algorithms for probabilistic inference from data. Various Bayesian models such as Bayes Point Machine classifiers, TrueSkill matchmaking, hidden Markov models, and Bayesian networks can be implemented using Infer.NET. Find more information and downloads at Infer.NET.
NodeXL is a template for Microsoft Excel 2007 and 2010 that lets you enter a network edge list, click a button, and see the network graph, all in the Excel window. You can easily customize the graph’s appearance; zoom, scale, and pan the graph; dynamically filter vertices and edges; alter the graph’s layout; find clusters of related vertices; and calculate a set of graph metrics. Install, download source, and/or participate in development at the NodeXL CodePlex site.
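As a rough illustration of what "enter a network edge list and calculate graph metrics" means, the following Python sketch builds an undirected graph from an edge list and computes two simple metrics: the degree of each vertex and the overall graph density. It is not NodeXL code and does not use any NodeXL API; the function name `graph_metrics` is hypothetical, and the sketch assumes an undirected graph with no self-loops.

```python
from collections import defaultdict


def graph_metrics(edges):
    """Return (degree per vertex, graph density) for an undirected edge list.

    Assumes no self-loops; duplicate edges are counted once.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    degree = {vertex: len(neighbors) for vertex, neighbors in adjacency.items()}
    vertex_count = len(adjacency)
    edge_count = sum(degree.values()) // 2
    # Density = actual edges divided by the maximum possible number of edges.
    if vertex_count > 1:
        density = 2 * edge_count / (vertex_count * (vertex_count - 1))
    else:
        density = 0.0
    return degree, density


if __name__ == "__main__":
    # Hypothetical edge list, one (source, target) pair per row.
    edge_list = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
    degrees, density = graph_metrics(edge_list)
    print(degrees, round(density, 3))
```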
Ontology Add-in for Word
This Microsoft Word add-in enables the annotation of Word documents based on terms that appear in ontologies. Install, download source, and/or participate in development at Ontology Add-in for Word.
Pivot makes it easier to interact with massive amounts of data on the web in ways that are powerful, informative, and fun. By visualizing thousands of related items at once, users can see trends and patterns that would be hidden when looking at one item at a time. Pivot is a stand-alone experimental application that was originally released as Microsoft Live Labs Pivot in October 2009 and is now available for download. For production applications, we recommend that users switch to the supported web version of Pivot: Silverlight PivotViewer.
With Project Trident, you can author workflows visually by using a catalog of existing activities and complete workflows. The workflow workbench provides a tiered library that hides the complexity of different workflow activities and services for ease of use. Install, download source, and/or participate in development at Project Trident.
QUT.Bio
This project focuses on tools for bioinformatics and computational biology; new approaches to visualizing sequences and the relationships between them, through SilverMap and SilverGene; and methods for promoter prediction, pattern description languages, and mash-ups. Install, download source, and/or participate in development at the QUT.Bio CodePlex site.
RNA Comparative Analysis Software Tools
This software toolkit helps researchers analyze and visualize biological sequences and structures. The infrastructure consists of a novel database, RNA Comparative Analysis Database (rCAD), and an application for visualization/manipulation of data from rCAD, Comparative Analysis Toolkit User Interface (CATUI). Install, download source, and/or participate in development at the RNA Comparative Analysis Software Tools CodePlex site.
SIGMA: Large Scale Machine Learning Toolkit
The goal of SIGMA is to provide a group of parallel machine-learning algorithms that can meet the requirements of research work and applications, typically with large-scale data or features. The toolkit includes more than 10 algorithms and runs them on a single multicore machine or on an HPC cluster with hundreds of machines and thousands of CPU cores. An SDK is provided for researchers and developers who want to invent their own algorithms and add them to the toolkit. Find more information and downloads at SIGMA: Large Scale Machine Learning Toolkit.
WebCell is an integrated simulation environment for managing quantitative and qualitative information on cellular networks, and for interactively exploring their steady-state and dynamic behaviors over the web. A user-friendly web interface allows users to efficiently create, visualize, simulate, and store their reaction network models, thereby facilitating kinetic modeling and simulation of biological systems of interest. Supported analysis methods for such models include, but are not limited to, structural pathway analysis, metabolic control analysis (MCA), conservation analysis, and dynamic simulation. Try it out at WebCell.
WinBioinfTools includes a number of bioinformatics programs that run on a Windows cluster running Windows HPC Server 2008. The current version includes the CoCoNUT system for pairwise genome comparison, parallel global sequence alignment, and parallel BLAST. The modules in this project are adapted to work on a Windows compute cluster running Windows HPC Server 2008. Install, download source, and/or participate in development at the WinBioinfTools CodePlex site.