statsmodels.stats.multitest.fdrcorrection_twostage

statsmodels.stats.multitest.fdrcorrection_twostage(pvals, alpha=0.05, method='bky', iter=False, is_sorted=False)

(Iterated) two-stage linear step-up procedure with estimation of the number of true hypotheses.

Implements the procedure in Definition 6 of Benjamini, Krieger and Yekutieli.

Parameters:
  • pvals (array_like) – set of p-values of the individual tests.
  • alpha (float) – error rate, i.e. the level at which the FDR is to be controlled
  • method ({'bky', 'bh'}) –

    see Notes for details

    • 'bky' - implements the procedure in Definition 6 of Benjamini, Krieger
      and Yekutieli 2006
    • 'bh' - the two-stage method of Benjamini and Hochberg
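  • iter (bool) – if True, the two-stage procedure is iterated; if False (default), the simple two-stage method is used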
Returns:

  • rejected (array, bool) – True if a hypothesis is rejected, False if not
  • pvalue-corrected (array) – p-values adjusted for multiple hypothesis testing to limit the FDR
  • m0 (int) – estimated number of true hypotheses, computed as ntest - rej (number of tests minus number of rejections)
  • alpha_stages (list of floats) – A list of alphas that have been used at each stage

Notes

The returned corrected p-values are specific to the given alpha; they cannot be reused for a different alpha.

The returned corrected p-values are from the last stage of the fdr_bh linear step-up procedure (fdrcorrection0 with method='indep'), corrected for the estimated fraction of true hypotheses. This means that the rejection decision can be obtained with pval_corrected <= alpha, where alpha is the original significance level. (Note: This has changed from earlier versions (<0.5.0) of statsmodels.)
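As a rough illustration of the note above, the following plain-Python sketch rescales Benjamini-Hochberg adjusted p-values by the estimated fraction of true hypotheses, so that the decision can be read off against the original alpha. The helper bh_adjusted, the sample p-values, and the variable names are assumptions made for this sketch, which follows the 'bky' variant (first stage at alpha / (1 + alpha)); it is not the statsmodels source.

```python
def bh_adjusted(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the original order.

    Hypothetical helper for this sketch, not a statsmodels function.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the result stays monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


alpha = 0.05
pvals = [0.001, 0.008, 0.012, 0.02, 0.04, 0.3, 0.6, 0.9]  # made-up sample
m = len(pvals)

adjusted = bh_adjusted(pvals)

# Stage 1: BH at alpha / (1 + alpha); estimate the number of true hypotheses.
alpha_prime = alpha / (1.0 + alpha)
r1 = sum(p <= alpha_prime for p in adjusted)
m0_hat = m - r1

# Stage 2 level (assumes 0 < r1 < m, which holds for this sample).
alpha_star = alpha_prime * m / m0_hat

# Rescale the BH-adjusted p-values so they compare against the original alpha.
scale = (m0_hat / m) * (1.0 + alpha)
corrected = [p * scale for p in adjusted]

rejected = [p <= alpha for p in corrected]
# Same decision, expressed as a comparison against the stage-2 level.
rejected_direct = [p <= alpha_star for p in adjusted]
```

Because the scale factor contains both alpha and an estimate that was itself obtained at that alpha, values rescaled this way are only meaningful for the alpha they were computed with.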

BKY described several other multi-stage methods, which would be easy to implement. However, in their simulation, the simple two-stage method (with iter=False) was the most robust to the presence of positive correlation.
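The simple two-stage method mentioned above can be sketched in plain Python as follows. This is a minimal illustration of the 'bky' variant without iteration, not the statsmodels implementation: the function names bh_reject and two_stage_bky and the sample p-values are hypothetical, and the returned m0 follows the "ntest - rej" convention from the Returns section.

```python
def bh_reject(pvals, alpha):
    """Benjamini-Hochberg linear step-up; returns rejection flags in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def two_stage_bky(pvals, alpha=0.05):
    """Hypothetical helper: simple (non-iterated) two-stage step-up procedure."""
    m = len(pvals)
    alpha_prime = alpha / (1.0 + alpha)  # stage-1 level
    alpha_stages = [alpha_prime]
    rejected = bh_reject(pvals, alpha_prime)
    r1 = sum(rejected)
    # If nothing or everything is rejected at stage 1, stop there.
    if 0 < r1 < m:
        m0_hat = m - r1  # stage-1 estimate of the number of true hypotheses
        alpha_star = alpha_prime * m / m0_hat
        alpha_stages.append(alpha_star)
        rejected = bh_reject(pvals, alpha_star)
    m0 = m - sum(rejected)  # number of tests minus final rejections
    return rejected, m0, alpha_stages


pvals = [0.001, 0.008, 0.012, 0.02, 0.04, 0.3, 0.6, 0.9]  # made-up sample
rejected, m0, alpha_stages = two_stage_bky(pvals, alpha=0.05)
```

With this sample, the first stage rejects four hypotheses and the second stage, run at the larger level, rejects a fifth, which is the gain in power the second stage is meant to provide.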

TODO: What should be returned?