The effects of sampling bias and model complexity on the predictive performance of MaxEnt species distribution models

Mindy M. Syfert, Matthew J. Smith, and David A. Coomes


Species distribution models (SDMs) trained on presence-only data are frequently used in ecological research and conservation planning. However, users of SDM software are faced with a variety of options, and it is not always obvious how selecting one option over another will affect model performance. Working with MaxEnt software and with tree fern presence data from New Zealand, we assessed whether (a) choosing to correct for geographical sampling bias and (b) using complex environmental response curves have strong effects on goodness of fit. SDMs were trained on tree fern data, obtained from an online biodiversity data portal, with two sources that differed in size and geographical sampling bias: a small, widely distributed set of herbarium specimens and a large, spatially clustered set of ecological survey records. We attempted to correct for geographical sampling bias by incorporating sampling bias grids in the SDMs, created from all georeferenced vascular plants in the datasets, and explored model complexity issues by fitting a wide variety of environmental response curves (known as “feature types” in MaxEnt). In each case, goodness of fit was assessed by comparing predicted range maps with tree fern presences and absences using an independent national dataset to validate the SDMs. We found that correcting for geographical sampling bias led to major improvements in goodness of fit, but did not entirely resolve the problem: predictions made with clustered ecological data were inferior to those made with the herbarium dataset, even after sampling bias correction. We also found that the choice of feature type had negligible effects on predictive performance, indicating that simple feature types may be sufficient once sampling bias is accounted for. Our study emphasizes the importance of reducing geographical sampling bias, where possible, in datasets used to train SDMs, and the effectiveness and necessity of sampling bias correction within MaxEnt.


Publication type: Article
Published in: PLOS One

Newer versions

Cory Merow, John A. Silander, and Matthew J. Smith. A practical guide to MaxEnt for modeling species' distributions: what it does, and why inputs and settings matter, Ecography, Wiley, April 2013.

Cory Merow, Matthew J. Smith, Thomas C. Edwards, Antoine Guisan, Sean McMahon, Signe Normand, Wilfried Thuiller, Rafael O. Wuest, Niklaus E. Zimmermann, and Jane Elith. What do we gain from simplicity versus complexity in species distribution models?, Ecography, Wiley, August 2014.

Mindy M. Syfert, Lucas N. Joppa, Matthew J. Smith, David A. Coomes, Steven P. Bachman, and Neil A. Brummitt. Using species distribution models to inform IUCN Red List assessments, Biological Conservation, Elsevier, June 2014.
