Computational Ecology and Environmental Science: Technology and Tools

Why do we need new tools?

Society urgently requires accurate predictions of how the living systems of the earth might respond to natural and anthropogenic drivers of change. Unfortunately, the models needed to make such predictions don’t currently exist. This is partly because we lack suitable methods and frameworks with which to rigorously research, test and implement predictive models of ecological systems, and partly because we lack sufficient understanding of how such systems function. We therefore aim to provide new technologies and methods (both hardware and software) to help address the key scientific challenges facing ecology and environmental science.

What are the tools that we provide?

Our current efforts focus on developing the core technology and software products summarized below. These are initially being developed for, and tested against, their ability to meet the requirements of our research objectives, and those of our collaborators. Through this process we intend to identify technological and software solutions that will have more widespread applicability and impact. Over the years we have also produced a variety of other experimental prototype tools and gadgets which can be found on our “Ecology Toolbox” webpage.

Multiscale Modelling Engine: a software laboratory for developing ecological models

We are researching a new modelling environment to enable the rapid development, testing and simulation of multi-level models of ecological systems. We intend for the engine to provide a standard framework in which i) models of arbitrary complexity can be developed, ii) processes at multiple spatial and temporal scales can be combined and simulated, iii) model predictions and uncertainty can be quantified, iv) model complexity can be appropriately justified, and v) model structure can be easily visualised and modified. We are researching how best to incorporate and combine these features through testing their utility in our active research projects: in particular our research into predicting the global carbon-climate feedback and predicting global ecosystem structure and function.

Filzbach: Bayesian analysis made easy

Bayesian inference is becoming increasingly popular in ecology and environmental science with the spreading realisation that the formal incorporation and quantification of uncertainty is required to obtain a predictive understanding of ecological systems. We have designed Filzbach to be an easy-to-learn yet robust entry to Bayesian analysis for ecologists who have a basic understanding of elementary programming concepts. Other inference frameworks and approaches, such as Infer.NET, are also researched in our Microsoft Research Cambridge lab, and we continue to research the costs and benefits that different inference frameworks may have for constructing probabilistic predictive models of ecological systems.

The core of Filzbach is a set of libraries, originally written in C, that estimates model parameters using a Markov chain Monte Carlo method, given data, a specified model, and prior parameter distributions. Filzbach is an attractive option for Bayesian inference of ecological models: i) it has a variety of methods that make it relatively straightforward to define parameters and distributions, run Bayesian (as well as Maximum Likelihood) parameter estimation, and assess the model output; ii) it is relatively robust to problems with highly uncertain (e.g. “flat”) prior parameter distributions, many parameters, and / or nasty non-linear models; iii) it is written in C and so runs relatively quickly; iv) on multi-processor computers it allows multiple Markov chains to be computed simultaneously to check for convergence without any extra cost in terms of time; v) it comes with a number of illustrative examples, covering very simple problems as well as some real ecological models. We are currently working on finalising the C version of Filzbach, whilst writing a new version in C#.
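To make the idea of Markov chain Monte Carlo parameter estimation concrete, here is a minimal Python sketch of a random-walk Metropolis sampler for a single parameter (the mean of normally distributed data) with a deliberately wide, near-flat prior. It is not Filzbach code and does not use the Filzbach API: the function names, the Gaussian model, the prior width and the step size are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0, prior_sd=100.0):
    """Unnormalised log posterior for the mean mu of Gaussian data.

    Assumed model (illustrative only): data ~ Normal(mu, sigma),
    with a wide Normal(0, prior_sd) prior on mu.
    """
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_likelihood = sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)
    return log_prior + log_likelihood


def metropolis(data, n_steps=5000, step_sd=0.5, start=0.0, seed=0):
    """Random-walk Metropolis sampler; returns every visited value of mu."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, data)
    samples = []
    for _ in range(n_steps):
        # Propose a small random jump from the current parameter value.
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [data_rng.gauss(3.0, 1.0) for _ in range(50)]
    # Two chains from different starting points, discarding burn-in.
    for start, seed in [(0.0, 1), (6.0, 2)]:
        chain = metropolis(data, start=start, seed=seed)[1000:]
        print(f"start={start}: posterior mean ~ {sum(chain) / len(chain):.3f}")
```

Running two chains from different starting values and comparing their posterior means is a crude version of the convergence check described in point iv: if the chains have converged, they should agree closely. On a multi-processor machine such chains could be run in parallel, since each one is independent of the others.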

FetchClimate: A lookup service for climate data

It is increasingly common for ecological and environmental models to incorporate climate data that has been “looked up” from historical climatological records, weather station data, processed gridded data or future climate predictions. However, accessing suitable data and processing it in the correct way can be technically demanding, extremely time-consuming and prone to error.

We have developed an in-house FetchClimate service to allow users to rapidly obtain climate data for their needs in the format, and at the spatiotemporal resolution, that they require. Features of the FetchClimate service are: i) the ability for users to request climate data from inside their model code using natural, intuitive commands; ii) the ability to make use of large volumes of remotely held climate data; iii) the automated selection of appropriate climate data relative to the user request; iv) the availability of multiple climate data sources to support the FetchClimate requests and automatic selection of the source with the lowest error; v) the reporting of the uncertainty associated with retrieved climate variables; vi) local caching of previous requests to speed up the querying of climate data. Over the coming months and years, we anticipate that FetchClimate will become externally available, and will expand into a more general FetchEnvironment service.
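Features iv to vi can be illustrated with a short Python sketch. This is not the FetchClimate API: the `Source` and `ClimateLookup` classes, the `fetch` method and the toy data sources are hypothetical names invented here, and the selection rule (choose the source reporting the smallest uncertainty) is an assumed simplification of "lowest error".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A source returns (value, uncertainty) for a variable at a location.
Estimate = Tuple[float, float]


@dataclass
class Source:
    """Hypothetical climate data source (illustrative only)."""
    name: str
    lookup: Callable[[str, float, float], Estimate]


@dataclass
class ClimateLookup:
    """Queries every source, keeps the lowest-uncertainty answer, caches it."""
    sources: List[Source]
    cache: Dict[Tuple[str, float, float], Tuple[str, float, float]] = field(
        default_factory=dict
    )
    source_calls: int = 0  # counts lookups that actually reached a source

    def fetch(self, variable: str, lat: float, lon: float):
        """Return (source name, value, uncertainty) for the request."""
        key = (variable, lat, lon)
        if key in self.cache:
            # Repeated request: answer locally without querying any source.
            return self.cache[key]
        best = None
        for source in self.sources:
            value, uncertainty = source.lookup(variable, lat, lon)
            self.source_calls += 1
            if best is None or uncertainty < best[2]:
                best = (source.name, value, uncertainty)
        if best is None:
            raise ValueError("no data sources configured")
        self.cache[key] = best
        return best


if __name__ == "__main__":
    # Toy sources returning made-up numbers purely for demonstration.
    coarse = Source("coarse_grid", lambda v, lat, lon: (14.0, 2.5))
    fine = Source("fine_grid", lambda v, lat, lon: (15.2, 0.8))
    service = ClimateLookup([coarse, fine])
    print(service.fetch("air_temperature", 52.2, 0.1))
    print(service.fetch("air_temperature", 52.2, 0.1))  # served from cache
    print("source lookups:", service.source_calls)
```

Returning the uncertainty alongside the value mirrors feature v, so that model code can propagate it rather than treating the retrieved climate variable as exact.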

Hardware and software for autonomous monitoring

We are researching and developing methods to autonomously monitor vulnerable species and ecosystems. Our work ranges from sensor networks and environmental sensing, through computer vision for ecology, to low-power mobile tracking devices. It focuses on ways to reduce the complexity and overhead of gathering ecological data, while also bringing novel methods and technologies to bear on existing problems. As these technologies become more robust, the volumes of data that scientists are able to gather can grow dramatically. We are therefore also developing methods for visualising and maintaining large datasets, and investigating the different kinds of questions that such datasets allow us to address.

Software for Computational Science

Alongside our ecology-specific projects we are contributing to more general research and development into software for computational science. To date these efforts have principally been focussed on the following tools:

Scientific DataSet

A set of .NET libraries that makes it easy for those writing .NET code (e.g. models of ecological systems) to read, write and share data. The libraries can input and output data in a variety of formats that are commonly used in ecology and environmental science (e.g. CSV, NetCDF).

Dataset Viewer

Facilitates the visualisation and exploration of data. The tool can read data in a variety of formats (e.g. CSV, NetCDF), allows the user to visualise data in a variety of common formats (line graphs, surface plots, maps), enables the visualisation and recording of dynamic data, and allows the user to save standard views of the data for later access.

Computational Science Studio

A prototype software environment for scientific research allowing users to handle data visualisation and processing, modelling and workflow management within a common software framework. In 2009 we produced a prototype Earth System Model using this framework. Drawing in part on the lessons learned from this project, we are now researching and developing new frameworks to improve the clarity, extensibility, and pace of development of models of complex dynamical systems.