
statsmodels.stats.nonparametric.samplesize_rank_compare_onetail(synthetic_sample, reference_sample, alpha, power, nobs_ratio=1, alternative='two-sided')

Compute sample size for the non-parametric Mann-Whitney U test.

This function implements the method of Happ et al. (2019).


synthetic_sample : array_like

Generated synthetic data representing the treatment group under the research hypothesis.

reference_sample : array_like

Advance information for the reference group.

alpha : float

The type I error rate for the test (two-sided).

power : float

The desired power of the test.

nobs_ratio : float, optional

Sample size ratio, nobs_ref = nobs_ratio * nobs_treat. This is the ratio of the reference group sample size to the treatment group sample size, by default 1 (balanced design). See Notes.

alternative : str, ‘two-sided’ (default), ‘larger’, or ‘smaller’

Extra argument to choose whether the sample size is calculated for a two-sided (default) or one-sided test. See Notes.


An instance of Holder containing the following attributes:


nobs_total : float

The total sample size required for the experiment.

nobs_treat : float

Sample size for the treatment group.

nobs_ref : float

Sample size for the reference group.

relative_effect : float

The estimated relative effect size.

power : float

The desired power for the test.

alpha : float

The type I error rate for the test.


In the context of the two-sample Wilcoxon-Mann-Whitney U test, the reference_sample typically represents data from the control group or from previous studies. The synthetic_sample is generated from this reference data and a prespecified relative effect size that is meaningful for the research question. This effect size is often determined in collaboration with subject matter experts so that it reflects a difference worth detecting. By comparing the reference and synthetic samples, this function estimates the sample size needed to achieve the desired power at the specified type I error rate.
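As a rough illustration of how a synthetic sample encodes the effect to be detected, the sketch below builds one by shifting a reference sample and then estimates the relative effect between the two. The data, the `shift` value, and the helper `relative_effect` are hypothetical and written only for this illustration; the helper assumes the convention P(reference < treatment) + 0.5 * P(reference == treatment), so values above 0.5 mean the treatment group tends to be larger. It is not the statsmodels implementation.

```python
def relative_effect(treatment, reference):
    """Estimate P(reference < treatment) + 0.5 * P(reference == treatment).

    Hypothetical helper for illustration only; it compares every
    treatment value with every reference value.
    """
    score = 0.0
    for t in treatment:
        for r in reference:
            if t > r:
                score += 1.0
            elif t == r:
                score += 0.5  # ties count as half
    return score / (len(treatment) * len(reference))


# Hypothetical advance information for the reference group.
reference_sample = [3, 5, 6, 8, 9, 11, 12, 14]

# Hypothetical synthetic treatment data: the reference values shifted
# upward by an amount judged relevant for the research question.
shift = 2
synthetic_sample = [x + shift for x in reference_sample]

effect = relative_effect(synthetic_sample, reference_sample)
print(f"estimated relative effect: {effect:.3f}")
```

With this upward shift the estimate comes out above 0.5, which under the stated convention corresponds to the 'larger' direction. A shift is only one way to construct the synthetic sample; any transformation that expresses the research hypothesis could be used instead.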

Choosing between one-sided and two-sided tests has important implications for sample size planning. A two-sided test is more conservative and requires a larger sample size, but it covers effects in both directions. In contrast, a one-sided test with alternative ‘larger’ (relative_effect > 0.5) or ‘smaller’ (relative_effect < 0.5) assumes the effect occurs in only one direction, which leads to a smaller required sample size. However, if the true effect is in the opposite direction, the one-sided test has virtually no power to detect it. Additionally, if a two-sided test ends up being used instead of the planned one-sided test, the original sample size may be insufficient, resulting in an underpowered study. It is important to weigh these trade-offs carefully when planning a study.
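The size of this trade-off can be gauged with a generic normal-approximation heuristic in which the required sample size is proportional to the squared sum of the critical quantile and the power quantile. The sketch below uses only that heuristic; it is not the formula of Happ et al. (2019), and the helper `approx_size_factor` is a hypothetical name introduced for illustration.

```python
from statistics import NormalDist


def approx_size_factor(alpha, power, alternative="two-sided"):
    """Return (z_crit + z_power)**2, a quantity the required sample size
    is roughly proportional to under a simple normal approximation.

    Hypothetical helper for illustration only.
    """
    if alternative not in ("two-sided", "larger", "smaller"):
        raise ValueError("alternative must be 'two-sided', 'larger' or 'smaller'")
    z = NormalDist().inv_cdf
    # A two-sided test splits alpha across both tails.
    tail_alpha = alpha / 2 if alternative == "two-sided" else alpha
    return (z(1 - tail_alpha) + z(power)) ** 2


two_sided = approx_size_factor(0.05, 0.8, "two-sided")
one_sided = approx_size_factor(0.05, 0.8, "larger")
ratio = one_sided / two_sided
print(f"one-sided / two-sided size factor: {ratio:.2f}")
```

Under this approximation, with alpha = 0.05 and power = 0.8, the one-sided design needs roughly four fifths of the two-sided sample size. Read the other way round, a study sized for a one-sided test falls short by about a quarter if a two-sided test is applied afterwards.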

For nobs_ratio > 1, nobs_ratio = 1, or nobs_ratio < 1, the reference group sample size is larger, equal to, or smaller than the treatment group sample size, respectively.
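The relation nobs_ref = nobs_ratio * nobs_treat fixes how a given total is divided between the two groups. The sketch below works through that arithmetic; `split_total` is a hypothetical helper, and how the function itself rounds the resulting group sizes is not covered here.

```python
def split_total(nobs_total, nobs_ratio=1.0):
    """Split a total sample size so that nobs_ref = nobs_ratio * nobs_treat.

    Hypothetical helper for illustration only; no rounding is applied.
    """
    if nobs_ratio <= 0:
        raise ValueError("nobs_ratio must be positive")
    nobs_treat = nobs_total / (1.0 + nobs_ratio)
    nobs_ref = nobs_ratio * nobs_treat
    return nobs_treat, nobs_ref


# Reference group twice as large as the treatment group.
print(split_total(90, nobs_ratio=2))
# Balanced design.
print(split_total(90, nobs_ratio=1))
# Reference group half as large as the treatment group.
print(split_total(90, nobs_ratio=0.5))
```

For a total of 90 observations, a ratio of 2 assigns 30 to the treatment group and 60 to the reference group, a ratio of 1 assigns 45 to each, and a ratio of 0.5 reverses the first split.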



Happ, M., Bathke, A. C., and Brunner, E. “Optimal sample size planning for the Wilcoxon-Mann-Whitney test”. Statistics in Medicine, Vol. 38, pp. 363-375, 2019.


Thall, P. F., and Vail, S. C. “Some covariance models for longitudinal count data with overdispersion”. Biometrics, pp. 657-671, 1990.

Last update: Oct 21, 2024