Oliver Stegle, Leopold Parts, Richard Durbin, and John Winn
6 May 2010
Gene expression measurements are influenced by a wide range of factors, such as the state of the cell, experimental conditions, and variants in the sequence of regulatory regions. To understand the effect of a variable of interest, such as the genotype of a locus, it is important to account for variation that is due to confounding causes. Here, we present VBQTL, a probabilistic approach for mapping expression quantitative trait loci (eQTLs) that jointly models contributions from genotype as well as known and hidden confounding factors. VBQTL is implemented within an efficient and flexible inference framework, making it fast and tractable on large-scale problems. We compare the performance of VBQTL with alternative methods for dealing with confounding variability on eQTL mapping datasets from simulations, yeast, mouse, and human. Bayesian complexity control and joint modelling are shown to yield more precise estimates of the contributions of different confounding factors, which in turn reveals additional associations with measured transcript levels compared to alternative approaches. We present a threefold larger collection of cis eQTLs than previously found in a whole-genome eQTL scan of an outbred human population. Altogether, 27% of the tested probes show a significant genetic association in cis, and we validate that the additional eQTLs are likely to be real by replicating them in different sets of individuals. Our method is the next step in the analysis of high-dimensional phenotype data, and its application has revealed insights into the genetic regulation of gene expression by demonstrating that cis-acting eQTLs are more abundant in human than previously shown. Our software is freely available online at http://www.sanger.ac.uk/resources/software/peer/.
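To illustrate why accounting for hidden confounders can increase power in eQTL mapping, the following is a minimal sketch on simulated data. It is not VBQTL: it uses a simpler stand-in technique (removing the top principal components of the expression matrix before testing), whereas the method described above models genotype and confounders jointly within a Bayesian framework. The function name, sample sizes, effect sizes, and the assumption that the number of hidden factors is known are all hypothetical choices made for this example.

```python
import numpy as np


def simulate_and_compare(n_samples=300, n_genes=50, n_hidden=3, seed=0):
    """Compare a SNP-expression correlation before and after removing
    the top principal components of the expression matrix.

    All settings are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)

    # Genotype of one locus coded as 0/1/2 minor-allele counts.
    snp = rng.binomial(2, 0.3, size=n_samples).astype(float)

    # Hidden confounders (e.g. batch or cell state) that affect every gene.
    hidden = rng.normal(size=(n_samples, n_hidden))
    weights = 2.0 * rng.choice([-1.0, 1.0], size=(n_hidden, n_genes))

    # Expression = confounder effects + noise; gene 0 also has a cis effect.
    expr = hidden @ weights + 0.5 * rng.normal(size=(n_samples, n_genes))
    expr[:, 0] += 0.5 * snp

    # Naive association: correlate the SNP with the raw expression of gene 0.
    r_raw = np.corrcoef(snp, expr[:, 0])[0, 1]

    # Stand-in correction: subtract the top n_hidden principal components.
    centered = expr - expr.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    factors = u[:, :n_hidden] * s[:n_hidden]
    residual = centered - factors @ vt[:n_hidden]

    # Association after correction, on the residual expression of gene 0.
    r_corrected = np.corrcoef(snp, residual[:, 0])[0, 1]
    return r_raw, r_corrected


if __name__ == "__main__":
    r_raw, r_corrected = simulate_and_compare()
    print(f"raw correlation:       {r_raw:.3f}")
    print(f"corrected correlation: {r_corrected:.3f}")
```

In this simulation the confounders dominate the variance of each gene, so the genotype signal is far easier to detect once they are removed. A two-step correction of this kind estimates the factors without reference to genotype, so a strong genetic effect could in principle be absorbed into them. Modelling both contributions jointly, as the abstract describes, is one way to address that.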
|Published in|PLoS Computational Biology|
|Publisher|PLoS Computational Biology (Public Library of Science Computational Biology)|
Open Access: Everything we publish is freely available online throughout the world, for you to read, download, copy, distribute, and use (with attribution) any way you wish. No permission required.
Leopold Parts, Oliver Stegle, John Winn, and Richard Durbin. Joint Genetic Analysis of Gene Expression Data with Inferred Cellular Phenotypes, PLoS Genetics, PLoS, January 2011.
Jim C. Huang, Anitha Kannan, and John M. Winn. Bayesian association of haplotypes and non-genetic factors to regulatory and phenotypic variation in human populations, 2007.
Oliver Stegle, Anitha Kannan, Richard Durbin, and John M. Winn. Accounting for Non-genetic Factors Improves the Power of eQTL Studies, 2008.