Microsoft Research announced the six recipients of the Computational Challenges in Synthetic Biology 2006 awards, totaling $570,000 (USD) in funding. The objective of this award is to stimulate foundational research in synthetic biology and DNA nanotechnology by identifying and addressing the unique computational challenges of these areas.
The number of biological parts available for synthetic biology engineering is growing rapidly; in a few years there will likely be thousands of potentially useful parts. The engineer will therefore be confronted with many possible choices, making the design process much more difficult. One solution to this problem is to develop software that, when presented with a generic model, will select the appropriate parts to construct the given device. To do this, we need to develop a parts description specification suitable for input into design software. In this proposal I wish to investigate formalisms for representing biological entities and to organize a workshop that will bring together the leaders in the field to begin the development of a community-wide standard in synthetic biology. In previous work I was one of the designers of the Systems Biology Markup Language (SBML), a standard that has proved to be extremely successful. The aim here is to apply the same mechanism, building community consensus on the basis of existing software prototypes and draft proposals.
A major problem facing synthetic biology is the tendency of bacterial systems to eliminate any genes which do not directly benefit the organism. This is a result of natural selection favoring shorter genome lengths, which are able to be replicated more quickly. We propose two advances in computational protein and gene design that directly address the problem.
We have previously demonstrated an algorithm capable of creating the shortest nucleotide sequence that encodes any two given proteins, taking advantage of multiple reading frames and the redundancy of the genetic code. We also have expertise in computational approaches to the redesign of proteins to satisfy particular functions. The current proposal will integrate these technologies to achieve two particular goals. The first involves the interleaving of an antibiotic resistance gene with a particular protein whose expression is desired. Challenging bacteria containing this construct with the appropriate antibiotic will create a selective pressure to keep the inserted gene; because the sequence of the protein of interest overlaps this coding sequence, deletion of the desired protein from the genome will be avoided. To maximize the possible overlap, we will use the tools of computational protein design to enumerate a set of conservative mutations that may be allowed without perturbing function. The second goal is to develop methods to directly reduce the coding length for a given protein, taking a two-step approach. As an initial step, we will redesign a two-domain protein consisting of a single polypeptide sequence into a heterodimeric protein complex, using standard methodologies from the field of protein design. As a second step, we will use our sequence optimization methods to overlap the coding sequences of the two components, leading to a substantially reduced length of DNA that codes for a functionally equivalent protein, what we may term "the world's shortest gene." Our approach integrates protein design, coding sequence optimization, and validation in an experimental context to address a major problem in the long-term viability of synthetic biological networks.
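The idea of encoding two proteins in overlapping reading frames can be illustrated with a small brute-force search. The following is only a sketch under simplifying assumptions, not the algorithm referred to above: it fixes the second protein in the +1 reading frame of the first, uses the standard genetic code, and the function names translate and overlap_in_frame_plus_one are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDDDEEEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate every complete codon of dna, starting at offset `frame`."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3))


def overlap_in_frame_plus_one(protein_a, protein_b):
    """Search for DNA encoding protein_a in frame 0 and protein_b in frame +1.

    Returns one such sequence, or None if no choice of synonymous codons works.
    """
    synonyms = {}
    for codon, aa in CODON_TABLE.items():
        synonyms.setdefault(aa, []).append(codon)

    def extend(dna, remaining):
        # Prune as soon as the frame +1 reading disagrees with protein_b.
        read = translate(dna, 1)[:len(protein_b)]
        if not protein_b.startswith(read):
            return None
        if not remaining:
            return dna if read == protein_b else None
        for codon in synonyms[remaining[0]]:
            found = extend(dna + codon, remaining[1:])
            if found:
                return found
        return None

    return extend("", protein_a)


if __name__ == "__main__":
    dna = overlap_in_frame_plus_one("MAL", "WL")
    print(dna, translate(dna, 0), translate(dna, 1))
```

The search succeeds only when the redundancy of the genetic code leaves enough freedom; for example, "MM" can only be encoded as ATGATG, whose +1 frame reads a stop codon, so no overlap with "W" exists. This is where allowing conservative mutations in the proteins, as proposed above, would enlarge the search space.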
We propose BioStudio as a collaborative integrated design environment (IDE) providing editing and revision control for synthetic genomes. Windows SharePoint Services and SQL Server will provide back-end support. The server middle-layer will extend existing open-source visualization-only software (GBrowse) with editing functions, including a BioBricks palette, and will communicate with the back-end using .NET and FrontPage Remote Procedure Calls. Genome sequence and annotations will be stored in XML with an adapter to the Generic Feature Format (GFF) standard. Synthetic biology teams will connect with a web browser client. Development and testing will proceed in conjunction with an in-house project to engineer a synthetic yeast cell that encodes a combinatorial satisfiability experiment to identify a minimal genome. Our physical-level editor is complementary to logical-level design tools such as BioJADE. It provides a multiscale visualization metaphor that enables non-computer-scientist teams to collaborate on documents that are too large for "Track changes" and too critical or interdependent to be consigned to a Wiki. Software will be released under a BSD license.
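As a rough illustration of what an adapter between GFF and an XML store might look like, the sketch below converts one feature line into an XML element. It assumes GFF3-style lines (nine tab-separated columns, with key=value attributes separated by semicolons); the element and attribute names, the function gff_line_to_xml, and the sample line are hypothetical and do not describe the BioStudio schema.

```python
import xml.etree.ElementTree as ET

# The first eight GFF3 columns; the ninth holds the key=value attributes.
GFF_COLUMNS = ("seqid", "source", "type", "start", "end", "score", "strand", "phase")


def gff_line_to_xml(line):
    """Convert one GFF3 feature line into a <feature> element (hypothetical schema)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GFF feature line has nine tab-separated columns")
    feature = ET.Element("feature", dict(zip(GFF_COLUMNS, fields[:8])))
    for pair in fields[8].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            ET.SubElement(feature, "attribute", {"key": key, "value": value})
    return feature


if __name__ == "__main__":
    # Illustrative line only; coordinates and identifiers are made up.
    sample = "chrI\tBioStudio\tgene\t335\t649\t.\t+\t.\tID=gene0001;Name=example"
    print(ET.tostring(gff_line_to_xml(sample), encoding="unicode"))
```

Keeping the column values as plain XML attributes makes the reverse conversion back to a GFF line straightforward, which is the property an adapter needs.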
The emerging field of synthetic biology is promoting the development of standard biological components for engineering biological circuits and, ultimately, synthetic cells. Standard DNA components include genes (coding sequences) that are transcribed into cellular macromolecules (proteins), and their flanking promoters (regulatory sequences) that control the activation and deactivation of genes in time and space.
Here, we aim to identify a comprehensive set of standard regulatory sequences using our established model system. This will involve transferring large genome segments from a donor microbial species (H. influenzae or P. aeruginosa) into a different host species (E. coli) and measuring expression of all genes in the new hybrid cells. Using these experimental data, standard regulatory sequences will be identified computationally as motifs shared between equivalent donor and host genes that display robust cross-expression. These sequences will be an important resource of DNA parts for the synthetic biology community and they will have immediate utility in our laboratory's efforts to reduce to practice the steps for rebuilding and rebooting a microbial genome.
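One very simple way to look for motifs shared between a donor gene and its host equivalent is to intersect the short subsequences (k-mers) found upstream of each. The sketch below is an assumption-laden illustration rather than the proposed method: it uses exact matching with a fixed length k, and the function names and the two upstream sequences are made up.

```python
def kmers(seq, k):
    """Return the set of all length-k subsequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(donor_upstream, host_upstream, k=6):
    """Return k-mers present upstream of both the donor gene and its host equivalent."""
    return kmers(donor_upstream.upper(), k) & kmers(host_upstream.upper(), k)


if __name__ == "__main__":
    # Made-up upstream regions for a cross-expressed donor/host gene pair.
    donor = "GGTTGACAATTT"
    host = "CCCTTGACAGGG"
    print(shared_kmers(donor, host))
```

A realistic analysis would tolerate mismatches and weigh candidates against background frequencies, and would be restricted to the gene pairs that actually show robust cross-expression in the hybrid cells.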
Recently, we developed a technique for self-assembling arbitrary nanoscale shapes and patterns using DNA. This technique, termed "scaffolded DNA origami", allows 100 nanometer diameter shapes to be created with a resolution of 6 nanometers. The technique further allows any shape to be covered by a pattern with over two hundred 6-nanometer features. These features, which can be thought of as pixels, could be carbon nanotubes, quantum dots, protein enzymes, or any of a number of other active nanoscale devices. Thus the technique has great potential to create nanoscale circuits and nanomachines.
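The figures quoted above are mutually consistent, as a back-of-envelope check suggests. The sketch assumes a circular shape 100 nanometers across tiled by square 6-nanometer pixels; both are simplifications introduced here for illustration, and the function name is hypothetical.

```python
import math


def pixel_estimate(diameter_nm=100.0, pixel_nm=6.0):
    """Estimate how many square pixels of side pixel_nm fit in a disc of the given diameter."""
    disc_area = math.pi * (diameter_nm / 2) ** 2   # area of the shape in nm^2
    return disc_area / pixel_nm ** 2               # divided by the area of one pixel


if __name__ == "__main__":
    print(round(pixel_estimate()))
```

Under these assumptions the estimate comes to roughly 218 pixels, in line with the "over two hundred" features described.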
Scale-up of DNA origami to create structures of much higher complexity and much greater size is an important problem for almost all DNA origami applications. Currently, the size and complexity of DNA origami are limited by the length of the long single strand of DNA that winds back and forth through every DNA origami structure. Rather than attempting to create larger origami by increasing the length of this strand, we propose to create larger structures by combining origami with what we call a "stacking bond". We believe the specificity of these stacking bonds can be programmed based on the shape of the DNA origami. That is, DNA origami designed to fit together like puzzle pieces should stick together to form the intended larger structures.
By using stacking bonds, we are poised to create a whole new family of DNA origami structures with complexities (in terms of pixels) perhaps ten times that already achieved. Interestingly, stacking bonds are likely to be reversible and so they may enable the creation of complex DNA machines with movable parts, or DNA structures that can expand to many times their original size. We will collaborate with Nanorex Corporation to create software capable of making the complex designs required.
Randomly fluctuating concentrations have been both a nuisance and a source of fascination in Synthetic Biology: a nuisance because they compromise the performance of engineered devices, and a fascination because they provide an inside view into the physical conditions in the cell. Such "noise" could be suppressed by negative feedback, but our preliminary mathematical results demonstrate information-theoretic limits to noise suppression and frustration trade-offs where reducing one type of variation inevitably amplifies another. Using a combination of theory, computation, and experiments, we propose to study the principles of noise suppression in the replication control of bacterial plasmids, systems where noise is a nuisance both for synthetic engineers who use plasmids as cloning vectors and for the natural system where fluctuations increase extinction rates. Guided by the models, we will replace or improve the natural replication control systems with synthetically engineered systems built from standardized parts, creating next-generation cloning vectors where both the average and the variation in copy number can be tuned by the engineer during the time course of the experiment.
Significantly improving on nature's solutions to noise suppression should be possible because plasmid evolution is subject to many restrictions that the engineer is not. The engineered plasmids will be directly evaluated using our new methods for counting molecules in single cells. The basic science aspects of the project form an integral part of a long-term study in the lab, and the cloning vectors will be included as an enabling technology in the Registry of Standard Biological Parts. Our approach is thus synthetic in both senses of the word: we synthesize new control loops from known parts to complement the range of natural systems, and use the results to integrate different aspects of control into a unified theory of negative feedback at the molecular scale.
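The effect of negative feedback on copy-number noise can be illustrated with a toy stochastic simulation. The sketch below is not the model developed in this project: it assumes a simple birth-death process in a single cell, simulated with the Gillespie algorithm, in which molecules are lost at a rate proportional to their number and are produced either at a constant rate or at a rate that falls as the copy number rises. All rate constants and function names are hypothetical.

```python
import random


def simulate_fano(production_rate, decay=1.0, n0=20, events=20000, seed=0):
    """Gillespie simulation of a birth-death process.

    Returns the time-averaged mean copy number and the Fano factor
    (variance divided by mean) over the simulated trajectory.
    """
    rng = random.Random(seed)
    n, t = n0, 0.0
    sum_n = sum_n2 = 0.0
    for _ in range(events):
        birth = production_rate(n)
        death = decay * n
        total = birth + death
        dt = rng.expovariate(total)      # waiting time until the next reaction
        sum_n += n * dt                  # time-weighted statistics
        sum_n2 += n * n * dt
        t += dt
        if rng.random() * total < birth:
            n += 1
        else:
            n -= 1
    mean = sum_n / t
    variance = sum_n2 / t - mean * mean
    return mean, variance / mean


def constant_production(n):
    """No feedback: production does not depend on copy number."""
    return 20.0


def feedback_production(n):
    """Negative feedback: production is repressed as copy number rises."""
    return 400.0 / (1.0 + (n / 10.0) ** 4)


if __name__ == "__main__":
    print("no feedback:  ", simulate_fano(constant_production))
    print("with feedback:", simulate_fano(feedback_production))
```

Without feedback the stationary distribution of this process is Poisson, so the Fano factor should come out near 1; with the repressed production rate it should be noticeably lower at a similar mean. The sketch says nothing about the information-theoretic limits or trade-offs mentioned above, which concern how far such suppression can go.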