FDR

Time: 2021-01-31 08:17:09

Note: excerpted from the web.

False discovery rate (FDR) control is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. In a list of rejected hypotheses, FDR controls the expected proportion of incorrectly rejected null hypotheses (type I errors). It is a less conservative procedure for comparison, with greater power than familywise error rate (FWER) control, at a cost of increasing the likelihood of obtaining type I errors.

The q-value is defined to be the FDR analogue of the p-value. The q-value of an individual hypothesis test is the minimum FDR at which the test may be called significant. One approach is to estimate q-values directly rather than fixing in advance a level at which to control the FDR.
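To make the definition concrete, here is a minimal sketch of one simple q-value estimate: the Benjamini-Hochberg adjusted p-value. It assumes every null hypothesis could be true (no estimate of the proportion of true nulls is used), so it is a conservative stand-in rather than the only way to estimate q-values. The function name `bh_q_values` is made up for this illustration.

```python
def bh_q_values(p_values):
    """Return BH-adjusted p-values (a simple q-value estimate), in input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


p = [0.01, 0.04, 0.03, 0.20]
for p_i, q_i in zip(p, bh_q_values(p)):
    print(f"p = {p_i:.2f}  q = {q_i:.4f}")
```

Read this way, a test with q = 0.04 is one that would be called significant at any FDR level of 0.04 or higher.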

So the q-value is what gets used when working with the FDR, much like the p-value. I basically did not understand what follows.

Classification of m hypothesis tests

The following table defines some random variables related to the m hypothesis tests.

                            Declared non-significant   Declared significant   Total
True null hypotheses        U                          V                      m0
Non-true null hypotheses    T                          S                      m − m0
Total                       m − R                      R                      m

The false discovery rate is given by $\mathrm{FDR} = \mathrm{E}\left[\frac{V}{V+S}\right] = \mathrm{E}\left[\frac{V}{R}\right]$, and one wants to keep this value below a threshold α.

($\frac{V}{R}$ is defined to be 0 when R = 0)
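As a small illustration of the definition, the sketch below computes the ratio V / R for a single experiment, using the convention that it is 0 when nothing is rejected, and then averages it over a few outcomes. The outcome counts are invented purely for illustration. In practice V and S are unobservable, so this shows what the quantity means rather than how to estimate it.

```python
def false_discovery_proportion(v, s):
    """Return V / R with R = V + S, defined as 0 when R = 0."""
    r = v + s
    return 0.0 if r == 0 else v / r


# Hypothetical (V, S) counts from four repeated experiments (made-up numbers).
outcomes = [(1, 9), (0, 5), (0, 0), (2, 6)]
proportions = [false_discovery_proportion(v, s) for v, s in outcomes]

# The FDR is the expectation of V / R; here a plain average stands in for it.
empirical_fdr = sum(proportions) / len(proportions)
print(proportions, empirical_fdr)
```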

Controlling procedures

Independent tests

The Simes procedure ensures that its expected value $\mathrm{E}\left[\frac{V}{V+S}\right]$ is less than a given α (Benjamini and Hochberg 1995). This procedure is valid when the m tests are independent. Let $H_1, \ldots, H_m$ be the null hypotheses and $P_1, \ldots, P_m$ their corresponding p-values. Order these values in increasing order and denote them by $P_{(1)}, \ldots, P_{(m)}$. For a given α, find the largest k such that $P_{(k)} \le \frac{k}{m} \alpha$.

Then reject (i.e. declare positive) all $H_{(i)}$ for $i = 1, \ldots, k$.
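The steps above can be sketched in a few lines of plain Python. This is a minimal illustration of the step-up rule as described, not a reference implementation. The function name `benjamini_hochberg` and the example p-values are assumptions made for this sketch.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with P_(k) <= (k / m) * alpha.
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k = rank
    # Reject the hypotheses with the k smallest p-values.
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


print(benjamini_hochberg([0.01, 0.03, 0.035, 0.5], alpha=0.05))
# prints [True, True, True, False]
```

In this example the second p-value (0.03) is above its own threshold of 0.025 but is still rejected, because the third p-value (0.035) is below its threshold of 0.0375. This is why the rule looks for the largest k instead of stopping at the first failure.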

Note: the mean α for these m tests is $\frac{\alpha (m+1)}{2m}$, which could be used as a rough FDR (RFDR) or "α adjusted for m independent tests."

NOTE: The RFDR calculation shown here is not part of the Benjamini and Hochberg method.
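The RFDR figure is just the average of the m per-rank thresholds (k / m) α for k = 1, ..., m. A quick numerical check, with α = 0.05 and m = 10 chosen arbitrarily for illustration:

```python
def rough_fdr(alpha, m):
    """Closed form for the mean of the thresholds (k / m) * alpha, k = 1..m."""
    return alpha * (m + 1) / (2 * m)


alpha, m = 0.05, 10
thresholds = [k / m * alpha for k in range(1, m + 1)]
mean_threshold = sum(thresholds) / m

# Both values should agree up to floating-point rounding.
print(mean_threshold, rough_fdr(alpha, m))
```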

Dependent tests

The Benjamini and Yekutieli procedure controls the false discovery rate under dependence assumptions. This refinement modifies the threshold and finds the largest k such that:

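$P_{(k)} \le \frac{k}{m \cdot c(m)} \alpha$
  • If the tests are independent: c(m) = 1 (same as above)
  • If the tests are positively correlated: c(m) = 1
  • If the tests are negatively correlated (or, more generally, under arbitrary dependence): $c(m) = \sum_{i=1}^{m} \frac{1}{i}$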
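A minimal sketch of this modified threshold, in the same step-up form as the independent case. The function name and the `arbitrary_dependence` flag are assumptions made for this illustration: with the flag off, c(m) = 1 and the rule reduces to the independent-case threshold.

```python
def benjamini_yekutieli(p_values, alpha=0.05, arbitrary_dependence=True):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    # c(m) is the harmonic sum 1 + 1/2 + ... + 1/m under arbitrary dependence,
    # and 1 for independent or positively correlated tests.
    c_m = sum(1.0 / i for i in range(1, m + 1)) if arbitrary_dependence else 1.0
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with P_(k) <= k / (m * c(m)) * alpha.
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / (m * c_m) * alpha:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


p = [0.001, 0.01, 0.035, 0.5]
print(benjamini_yekutieli(p, arbitrary_dependence=True))
print(benjamini_yekutieli(p, arbitrary_dependence=False))
```

With these example p-values the stricter threshold rejects two hypotheses, while c(m) = 1 rejects three, which shows the price paid for dropping the independence assumption.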

In the case of negative correlation, c(m) can be approximated by using the Euler-Mascheroni constant $\gamma \approx 0.57721$:

$\sum_{i=1}^{m} \frac{1}{i} \approx \ln(m) + \gamma$

Using the RFDR above, an approximate FDR (AFDR) is the min(mean α) for m dependent tests: AFDR = RFDR / (ln(m) + 0.57721...).
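The sketch below checks how close ln(m) + γ is to the exact harmonic sum, and then evaluates the AFDR expression. The values α = 0.05 and m = 100 are arbitrary choices for illustration, and the helper names are made up for this sketch.

```python
import math

# Euler-Mascheroni constant, written out since the math module does not provide it.
EULER_MASCHERONI = 0.5772156649015329


def harmonic(m):
    """Exact c(m) = 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / i for i in range(1, m + 1))


def harmonic_approx(m):
    """Approximation c(m) ~ ln(m) + gamma."""
    return math.log(m) + EULER_MASCHERONI


def approximate_fdr(alpha, m):
    """AFDR = RFDR / (ln(m) + gamma), with RFDR = alpha * (m + 1) / (2 * m)."""
    rfdr = alpha * (m + 1) / (2 * m)
    return rfdr / harmonic_approx(m)


m = 100
print(harmonic(m), harmonic_approx(m))
print(approximate_fdr(0.05, m))
```

The approximation slightly underestimates the exact sum (by roughly 1 / (2m)), so for small m it may be simpler to compute the sum directly.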