Estimating false discovery rates for contingency tables

Jonathan M. Carlson, David Heckerman, and Guy Shani

May 2009

When testing a large number of hypotheses, it can be helpful to estimate or control the false discovery rate (FDR), the expected proportion of tests called significant that are truly null. The FDR is closely linked to the probability that a truly null test is called significant, and thus a number of methods have been described that estimate or control the FDR by directly using the p-values of the hypothesis tests. Most of these methods assume that the p-values are uniformly and continuously distributed under the null hypothesis, an assumption that often does not hold for finite data. In this paper, we consider the estimation of FDR for contingency tables. We show how Fisher's exact test can be extended to efficiently calculate the exact null distribution over a set of contingency tables. Using this exact null distribution, we explore the estimation of each of the terms in the FDR estimator, characterize the asymptotic convergence of the estimator, and show how the conservative bias can be reduced by removing certain tests from consideration. The resulting estimator has substantially less conservative bias than traditional approaches.
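To make the idea of an exact, discrete null distribution concrete, here is a minimal Python sketch for 2x2 tables with fixed margins. It is an illustration under stated assumptions, not the report's algorithm: the function names (`table_null`, `null_cdf`, `estimate_fdr`) are hypothetical, the p-value is the usual two-sided Fisher p-value, and the proportion of true nulls is conservatively taken to be 1.

```python
from math import comb

def table_null(r, c, n):
    """Exact null distribution for 2x2 tables with row-1 total r,
    column-1 total c and grand total n (hypergeometric model).
    Returns {x: (p_value, probability)}, x being the top-left cell."""
    lo, hi = max(0, r + c - n), min(r, c)
    denom = comb(n, r)
    # Integer weights avoid floating-point ties when comparing tables.
    w = {x: comb(c, x) * comb(n - c, r - x) for x in range(lo, hi + 1)}
    return {
        x: (sum(v for v in w.values() if v <= wx) / denom, wx / denom)
        for x, wx in w.items()
    }

def null_cdf(r, c, n, t):
    """P(p-value <= t) under the null for the given margins."""
    return sum(px for pval, px in table_null(r, c, n).values() if pval <= t)

def estimate_fdr(tables, t):
    """Illustrative FDR estimate at threshold t.
    tables: list of (x, r, c, n), with x inside the support of its margins.
    Assumption: every test is treated as null (pi0 = 1)."""
    expected_null = 0.0
    called = 0
    for x, r, c, n in tables:
        null = table_null(r, c, n)
        expected_null += sum(px for pval, px in null.values() if pval <= t)
        if null[x][0] <= t:
            called += 1
    return min(1.0, expected_null / max(called, 1))
```

For margins `(r, c, n) = (3, 3, 6)` the smallest attainable p-value is 0.1, so `null_cdf(3, 3, 6, 0.05)` is 0 rather than the 0.05 that a uniform null would imply. Tables of this kind add nothing to the expected number of false discoveries at that threshold.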

Publication type | TechReport |

URL | http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/FalseDiscoveryRate/ |

Number | MSR-TR-2009-53 |

Publisher | Microsoft |

- Large-Sample Learning of Bayesian Networks is Hard
- Learning mixtures of DAG models
- Marked Epitope- and Allele-Specific Differences in Rates of Mutation in Human Immunodeficiency Type 1 (HIV-1) Gag, Pol, and Nef Cytotoxic T-Lymphocyte Epitopes in Acute/Early HIV-1 Infection
