Two of the hottest areas of scientific discussion these days are computational science, the intersection between computer science and other sciences, and systems biology, the effort to decipher the code of the human genome.
Andrew Phillips gets to work in both.
Phillips, a scientist at Microsoft Research Cambridge, is working with the stochastic pi-calculus, a programming language particularly applicable to biological systems.
“There’s been a lot of research in computer science on programming-language theory,” Phillips says, “and a lot of that can be applied to biological modeling.”
The stakes are large. The products of that modeling could provide insights into how biological systems work, and those insights could help in understanding and curing diseases.
One case study on which Phillips is working involves the simulation of a model of the immune system that scans for the presence of harmful substances inside a cell, much as a computer anti-virus program does. That work led to his receiving a couple of recent Medical Research Council grants in collaboration with the United Kingdom’s University of Southampton.
Such similarities are what led to Phillips’ approach.
“In many ways,” he says, “biological systems are like massively parallel, highly complex, error-prone computer systems.”
One of the approaches taken in the study of systems biology is to build detailed models of systems on a computer and then test such models in a lab environment. Biologists can simulate a range of experiments, explore the most promising, and save precious time and money in the process.
Phillips got involved in this research a couple of years ago. He had worked at Microsoft Research Cambridge as both an intern and a post-doctoral researcher, and he began collaborating with Luca Cardelli, a principal researcher in the Cambridge lab. Things began to blossom from there.
To build biological models, they determined, they needed suitable tools, ones that could scale to large, complex systems. One such tool, they thought, could be a biological programming language.
Plenty of work has been done over the years in designing programming languages suitable for complex, parallel computer systems, not dissimilar to biological systems, and one such language is the stochastic pi-calculus. Phillips’ project builds upon work done at Israel’s Weizmann Institute of Science.
“Because it’s a language,” Phillips says, “you have all the tools associated with a language. And because it’s a mathematical language, you want to be able to do analysis on models, which could be useful for designing biological systems.”
Similarities between complex, parallel systems, computer or biological, raise a couple of interesting challenges.
In some ways, research into the intensely parallel nature of biological systems, in which multiple interactions can occur simultaneously, is pushing the boundaries of concurrent programming, a field that is increasingly important as multicore processors move into the mainstream.
Meanwhile, to cope with systems so complex they become unwieldy to analyze, Phillips is using a modular technique, common in computer-systems analysis, that enables new components to be described separately and modified individually without changing the rest of the system.
Modules have become indispensable in computer programming and could have a similar effect on biological modeling.
“You can refine one module without changing all the rest of the system,” Phillips says, “whereas before, everything was kind of interlaced, and it was difficult to extract the behavior of a particular protein. Here, you can say, ‘Let me refine this,’ and then refine it again. That’s the power of modular programming.”
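The idea of refining one module while leaving the rest untouched can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses plain reaction lists rather than the stochastic pi-calculus, and every module name, species name, and rate below is hypothetical, not taken from Phillips’ immune-system model.

```python
# A reaction is (rate constant, reactant names, product names).
# All names and numbers here are invented for illustration.

def detection_module():
    """Coarse first draft: a sensor binds a foreign peptide in one step."""
    return [(0.01, ["Sensor", "Peptide"], ["Complex"])]

def detection_module_refined():
    """Refinement: binding passes through a reversible intermediate state."""
    return [
        (0.01, ["Sensor", "Peptide"], ["Loose"]),
        (0.10, ["Loose"], ["Sensor", "Peptide"]),
        (0.05, ["Loose"], ["Complex"]),
    ]

def signalling_module():
    """Downstream module: depends only on the species 'Complex'."""
    return [(0.02, ["Complex"], ["Complex", "Signal"])]

def assemble(*modules):
    """Build a whole model by concatenating the reactions of each module."""
    reactions = []
    for module in modules:
        reactions.extend(module())
    return reactions

coarse = assemble(detection_module, signalling_module)
refined = assemble(detection_module_refined, signalling_module)
```

Swapping in the refined detection module changes only that module’s reactions; the signalling module is reused as written because both versions still produce the species it depends on.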
Perhaps so, but biologists, busy working within their own discipline, aren’t likely to have time to comb through the latest advances in computer science in a search for similarities. So Phillips chose to make it easy for them.
One project on which he worked involved developing a graphical representation for the stochastic pi-calculus that would make it more accessible. He and Cardelli, in collaboration with Giuseppe Castagna in Paris, defined a graphical calculus and a graphical execution model that enable graphical and textual representations to be used interchangeably.
“The biologist does not have to write program code, just draw pictures,” Phillips says. “And when the code is executed, you have a graphical execution model.”
More assistance came via the development of a simulation algorithm. Once proved correct, the algorithm was implemented in functional program code, resulting in the Stochastic Pi Machine (SPiM), which is available for download. The simulator was written in F#, a functional programming language for .NET, which made for a rapid implementation.
“We’ve done the first provably correct simulation algorithm for this,” Phillips notes, “which is very important if you want to get accurate simulation results.”
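The article does not spell out how SPiM’s algorithm works. For readers curious what a stochastic simulation looks like, the sketch below shows Gillespie’s direct method, a standard technique for simulating chemical reactions one event at a time. It is written in Python rather than F#, it is not SPiM’s implementation or the provably correct algorithm itself, and the toy reactions and rates are assumptions made up for illustration.

```python
import random

def gillespie(initial, reactions, t_end, seed=0):
    """Run one stochastic simulation; return a list of (time, counts) snapshots.

    initial:   dict mapping species name -> molecule count
    reactions: list of (rate_constant, reactants, products); this sketch
               assumes the reactants of each reaction are distinct species
    """
    rng = random.Random(seed)          # seeded so a run is repeatable
    state = dict(initial)
    t = 0.0
    trace = [(t, dict(state))]
    while True:
        # Propensity: rate constant times the count of each reactant.
        propensities = []
        for rate, reactants, _products in reactions:
            a = rate
            for species in reactants:
                a *= state[species]
            propensities.append(a)
        total = sum(propensities)
        if total == 0:                 # nothing can react any more
            break
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, weighted by propensity.
        index = rng.choices(range(len(reactions)), weights=propensities)[0]
        _rate, reactants, products = reactions[index]
        for species in reactants:
            state[species] -= 1
        for species in products:
            state[species] += 1
        trace.append((t, dict(state)))
    return trace

# Hypothetical toy system: A + B <-> AB (all numbers invented).
reactions = [
    (0.01, ["A", "B"], ["AB"]),   # binding
    (0.10, ["AB"], ["A", "B"]),   # unbinding
]
trace = gillespie({"A": 100, "B": 80, "AB": 0}, reactions, t_end=10.0)
```

Each entry in `trace` records the molecule counts just after one reaction fires, so plotting a species against time shows the random fluctuations that a purely deterministic model would smooth away.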
Pharmaceutical companies are intensely interested in this work. Biological modeling would be of great help in developing and testing drugs and detecting problems with them before they are released to the public.
“Your immune system is like an information-processing system,” Phillips says, “and it works like a program. If you want to design something to perform a specific function, you need to have the design tools to do that.”
More work remains, of course. Phillips is looking ahead to 3-D visualizations that could enable scientists to see systems evolve dynamically; more robust prototypes; and, through his Medical Research Council grants with the University of Southampton, “making a contribution to medical research.”
Phillips, you see, takes this eScience stuff seriously.
“Working with biologists,” he says, “we get great feedback on what they like and what they don’t like, so we know we’re on the right track.
“Computer scientists tend to focus on the computer science and not really worry about how other sciences perceive things or how other disciplines are able to use their results. So being able to bridge that gap … I’m quite happy.”