> eScience
> Microsoft Computational Biology Web Tools
> Fisher's Exact Test
> Fisher's Exact Test Details
This tool rapidly computes Fisher's exact test of independence for 2x2 contingency tables. Below, we explain how Fisher's exact test can be efficiently computed. For a more detailed description, please refer to the technical report.
Given a 2x2 contingency table, summarizing the co-occurrences of two boolean variables - X, and Y:

Fisher showed that the probability of a table under the null hypothesis that X and Y are independent, follows the hypergeometric distribution:

where teta_X=a+b, teta_Y=a+c, and n=a+b+c+d.
There are two difficulties with the equation above. First, computing factorials for large numbers causes numerical overflow. Second, a direct computation requires 2n operations.
We solve the overflow problem by selectively multiplying and dividing terms from
the equation. That is, whenever the partial result grows beyond 1, we divide by a term
in the denominator, and wherever the partial result shrinks below 1, we multiply by a
term in the numerator.
To reduce the overall runtime we factorize the factorials. As we have argued above,
in many cases the marginals are not balanced. In that case there is a considerable
difference between the minimal and the maximal marginal. If teta_Y = c + d is the
minimal marginal we factorize the computation as follows:

which requires 3(c + d) operations. In many cases, one of the entries of the table significantly
dominates the rest, making the above computation a dramatic improvement
over the straight forward implementation. Algorithm 1 provides pseudo-code for our
approach.

Fisher suggested to compute a p-value by summing the probabilities of all the tables that are "as extreme" as the current table.
A one-sided test sums all the tables that are more extreme in one direction, that is, all the tables consistent with the marginals that have smaller a (Left)

or larger a (Right).

The Fisher two-tailed p-value for a table t=[a,b,c,d] is defined as:

which is the sum of all tables consistent with the marginals that are as likely as the current table.
If we order the permutations by increasing value of a, we can see that, for two tables

, we have

Therefore, computing the probabilities of all the permutations can be done by computing
the probability of the table with the minimal possible a, and then iteratively
computing successor probabilities until reaching the maximal possible a.