statsmodels.stats.multitest.fdrcorrection_twostage
statsmodels.stats.multitest.fdrcorrection_twostage(pvals, alpha=0.05, method='bky', maxiter=1, iter=None, is_sorted=False)
(Iterated) two-stage linear step-up procedure with estimation of the number of true hypotheses.
Benjamini, Krieger and Yekutieli, procedure in Definition 6.
- Parameters:
- pvals : array_like
Set of p-values of the individual tests.
- alpha : float
Target error rate (FDR level).
- method : {'bky', 'bh'}
See Notes for details.
'bky' - implements the procedure in Definition 6 of Benjamini, Krieger and Yekutieli 2006.
'bh' - the two-stage method of Benjamini and Hochberg.
- maxiter : int or bool
Maximum number of iterations.
maxiter=1 (default) corresponds to the two-stage method.
maxiter=-1 corresponds to full iteration, which is equivalent to maxiter=len(pvals).
maxiter=0 uses only a single-stage FDR correction using a 'bh' or 'bky' prior fraction of assumed true hypotheses.
Boolean maxiter is allowed for backwards compatibility with the deprecated iter keyword: maxiter=False is two-stage FDR (maxiter=1), and maxiter=True is full iteration (maxiter=-1 or maxiter=len(pvals)).
Added in version 0.14: Replacement for iter with additional features.
- iter : bool
iter is deprecated; use maxiter instead. Consistent with the boolean maxiter values above, if iter is False, only one iteration step is used; this is the two-stage method. If iter is True, iteration stops at convergence, which occurs in a finite number of steps (at most len(pvals) steps).
Deprecated since version 0.14: Use maxiter instead of iter.
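To make the default two-stage case (method='bky', maxiter=1) concrete, the following is a minimal pure-Python sketch of Definition 6 as summarized above. It is not the statsmodels implementation: the helper names bh_num_rejected and two_stage_sketch are hypothetical, and the sketch only counts rejections and records the level used at each stage.

```python
def bh_num_rejected(sorted_pvals, q):
    """Count rejections of the linear step-up (BH) procedure at level q."""
    m = len(sorted_pvals)
    num_rejected = 0
    for i, p in enumerate(sorted_pvals, start=1):
        if p <= q * i / m:
            num_rejected = i  # step-up: the largest qualifying rank decides
    return num_rejected


def two_stage_sketch(pvals, alpha=0.05):
    """Hypothetical helper: return (number rejected, levels used per stage)."""
    sorted_pvals = sorted(pvals)
    m = len(sorted_pvals)
    # Stage 1: step-up at the reduced level alpha / (1 + alpha).
    alpha_prime = alpha / (1.0 + alpha)
    r1 = bh_num_rejected(sorted_pvals, alpha_prime)
    if r1 == 0 or r1 == m:
        return r1, [alpha_prime]  # nothing, or everything, rejected: stop
    # Stage 2: estimate the number of true hypotheses and rescale the level.
    m0_hat = m - r1
    alpha_star = alpha_prime * m / m0_hat
    r2 = bh_num_rejected(sorted_pvals, alpha_star)
    return r2, [alpha_prime, alpha_star]


# Illustrative p-values (made up for this sketch).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
num_rejected, stages = two_stage_sketch(pvals, alpha=0.05)
print(num_rejected, stages)
```

Larger maxiter values would repeat the second stage with an updated estimate until the rejection count stops changing; that loop is left out here to keep the sketch short.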
- Returns:
rejected (ndarray, bool) – True if a hypothesis is rejected, False if not
pvalue-corrected (ndarray) – p-values adjusted for multiple hypothesis testing to limit FDR
m0 (int) – ntests - rej, the estimated number of true (not rejected) hypotheses
alpha_stages (list of floats) – A list of alphas that have been used at each stage
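As a brief illustration of the four return values, the sketch below calls the function with its defaults and unpacks the result. The p-values are made up for the example, and the variable names are arbitrary.

```python
import numpy as np
from statsmodels.stats.multitest import fdrcorrection_twostage

# Illustrative p-values (made up for this example).
pvals = np.array([0.001, 0.008, 0.039, 0.041, 0.042,
                  0.06, 0.074, 0.205, 0.212, 0.216])

reject, pvals_corrected, m0, alpha_stages = fdrcorrection_twostage(
    pvals, alpha=0.05, method="bky"
)

print(reject)           # boolean array, one entry per hypothesis
print(pvals_corrected)  # adjusted p-values, in the order of the input
print(m0)               # estimated number of true (not rejected) hypotheses
print(alpha_stages)     # the alpha used at each stage
```

The outputs follow the order of the input p-values, so they can be stored alongside the original test results without re-sorting.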
Notes
The returned corrected p-values are specific to the given alpha, they cannot be used for a different alpha.
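A small check of this dependence on alpha, using made-up p-values: the same input is corrected at two different levels and the adjusted p-values are compared.

```python
import numpy as np
from statsmodels.stats.multitest import fdrcorrection_twostage

# Illustrative p-values (made up for this example).
pvals = np.array([0.001, 0.008, 0.039, 0.041, 0.042,
                  0.06, 0.074, 0.205, 0.212, 0.216])

# Only the corrected p-values (second return value) are kept here.
_, corrected_05, _, _ = fdrcorrection_twostage(pvals, alpha=0.05)
_, corrected_10, _, _ = fdrcorrection_twostage(pvals, alpha=0.10)

# The two sets of corrected p-values differ, so each is only valid
# for the alpha it was computed with.
same = np.allclose(corrected_05, corrected_10)
print(same)
```

To test at a different level, the function has to be called again with that alpha rather than reusing earlier output.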
The returned corrected p-values are from the last stage of the fdr_bh linear step-up procedure (fdrcorrection0 with method='indep'), corrected for the estimated fraction of true hypotheses. This means that the rejection decision can be obtained with pval_corrected <= alpha, where alpha is the original significance level. (Note: This has changed from earlier versions (<0.5.0) of statsmodels.)
BKY described several other multi-stage methods, which would be easy to implement. However, in their simulation the simple two-stage method (with iter=False) was the most robust to the presence of positive correlation.
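The rejection rule stated in these Notes can be checked directly. The sketch below, on made-up p-values, compares the returned rejection array with the decision recomputed from the corrected p-values at the original alpha.

```python
import numpy as np
from statsmodels.stats.multitest import fdrcorrection_twostage

# Illustrative p-values (made up for this example).
pvals = np.array([0.001, 0.008, 0.039, 0.041, 0.042,
                  0.06, 0.074, 0.205, 0.212, 0.216])
alpha = 0.05

reject, pvals_corrected, m0, alpha_stages = fdrcorrection_twostage(
    pvals, alpha=alpha
)

# Recompute the decision from the corrected p-values at the original alpha.
manual_reject = pvals_corrected <= alpha
agree = np.array_equal(reject, manual_reject)
print(agree)
```

A corrected p-value lying exactly on the threshold could in principle be affected by floating-point rounding; none of the values in this example is near the boundary.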
TODO: What should be returned?