Estimating false discovery rates for contingency tables

Jonathan M. Carlson, David Heckerman, and Guy Shani

May 2009

When testing a large number of hypotheses, it can be helpful to estimate or control the false discovery rate (FDR), the expected proportion of tests called significant that are truly null. The FDR is closely linked to the probability that a truly null test is called significant, and thus a number of methods have been described that estimate or control the FDR by directly using the p-values of the hypothesis tests. Most of these methods assume that the p-values are uniformly and continuously distributed under the null hypothesis, an assumption that often does not hold for finite data. In this paper, we consider the estimation of FDR for contingency tables. We show how Fisher's exact test can be extended to efficiently calculate the exact null distribution over a set of contingency tables. Using this exact null distribution, we explore the estimation of each of the terms in the FDR estimator, characterize the asymptotic convergence of the estimator, and show how the conservative bias can be reduced by removing certain tests from consideration. The resulting estimator has substantially less conservative bias than traditional approaches.
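As a rough illustration of why the uniform-p-value assumption fails for finite contingency tables, the sketch below enumerates the exact null distribution of the two-sided Fisher's exact test p-value for a 2x2 table with fixed margins, and then plugs it into a simple plug-in FDR estimate. It is not the estimator described in the report: the function names, the tuple layout `(a, row1, col1, n)`, and the conservative choice of treating every test as null are assumptions made only for this example.

```python
from math import comb

def fisher_null_distribution(row1, col1, n):
    """Enumerate every feasible 2x2 table with the given margins.

    Returns a dict mapping the top-left cell count a to a pair
    (null probability of that table, two-sided Fisher p-value).
    """
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Hypergeometric probability of each table under independence.
    pmf = {a: comb(col1, a) * comb(n - col1, row1 - a) / comb(n, row1)
           for a in range(lo, hi + 1)}
    dist = {}
    for a, p in pmf.items():
        # Two-sided p-value: total mass of tables no more likely than this one
        # (small relative tolerance guards against floating-point ties).
        pval = sum(q for q in pmf.values() if q <= p * (1 + 1e-9))
        dist[a] = (p, min(pval, 1.0))
    return dist

def prob_significant_under_null(row1, col1, n, alpha):
    """Exact probability that a truly null test has p-value <= alpha."""
    dist = fisher_null_distribution(row1, col1, n)
    return sum(p for p, pval in dist.values() if pval <= alpha)

def plug_in_fdr(tables, alpha):
    """Illustrative FDR estimate; assumes all tests are null (conservative).

    tables: iterable of (a, row1, col1, n), a hypothetical layout for this sketch.
    """
    expected_false = 0.0
    called = 0
    for a, row1, col1, n in tables:
        expected_false += prob_significant_under_null(row1, col1, n, alpha)
        if fisher_null_distribution(row1, col1, n)[a][1] <= alpha:
            called += 1
    return min(1.0, expected_false / max(called, 1))

if __name__ == "__main__":
    # With margins 5/5 out of 10, a null test is significant at 0.05 far less
    # often than 5% of the time, which is why assuming uniformity is conservative.
    print(prob_significant_under_null(5, 5, 10, 0.05))
    print(plug_in_fdr([(5, 5, 5, 10), (2, 5, 5, 10)], 0.05))
```

Because the p-value can take only a handful of values for small tables, the exact probability of a null test being called significant at level alpha is generally below alpha; a method that assumes it equals alpha overestimates the expected number of false discoveries.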

Publication type | TechReport |

URL | http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/FalseDiscoveryRate/ |

Number | MSR-TR-2009-53 |

Publisher | Microsoft |

- Asymptotic model selection for directed networks with hidden variables
- Computationally efficient methods for selecting among mixtures of graphical models, with discussion
- Learning Bayesian Networks is NP-Hard
