statsmodels.stats.nonparametric.samplesize_rank_compare_onetail

statsmodels.stats.nonparametric.samplesize_rank_compare_onetail(synthetic_sample, reference_sample, alpha, power, nobs_ratio=1, alternative='two-sided')

Compute sample size for the non-parametric Mann-Whitney U test.

This function implements the method of Happ et al. (2019) [1].

Parameters:

synthetic_sample : array_like
    Generated synthetic data representing the treatment group under the research hypothesis.
reference_sample : array_like
    Advance information for the reference group.
alpha : float
    The type I error rate for the test (two-sided).
power : float
    The desired power of the test.
nobs_ratio : float, optional
    Sample size ratio, nobs_ref = nobs_ratio * nobs_treat. This is the ratio of the reference group sample size to the treatment group sample size. Default is 1 (balanced design). See Notes.
alternative : str, ‘two-sided’ (default), ‘larger’, or ‘smaller’
    Extra argument to choose whether the sample size is calculated for a two-sided (default) or one-sided test. See Notes.

Returns:

res : Holder
    An instance of Holder containing the following attributes:

nobs_total : float
    The total sample size required for the experiment.
nobs_treat : float
    Sample size for the treatment group.
nobs_ref : float
    Sample size for the reference group.
relative_effect : float
    The estimated relative effect size.
power : float
    The desired power for the test.
alpha : float
    The type I error rate for the test.

Notes

In the context of the two-sample Wilcoxon-Mann-Whitney U test, the reference_sample typically represents data from the control group or from previous studies. The synthetic_sample is generated from this reference data and a prespecified relative effect size that is meaningful for the research question. This effect size is often determined in collaboration with subject matter experts so that it reflects a difference worth detecting. By comparing the reference and synthetic samples, this function estimates the sample size needed to achieve the desired power at the specified type I error rate.
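To make the role of the two samples concrete, the relative effect between them can be estimated by comparing every reference value with every synthetic value. The sketch below is illustrative only: the helper name relative_effect, the toy data, and the direction convention P(reference < synthetic) + 0.5 * P(reference = synthetic) are assumptions, since this page does not show how the function computes its relative_effect attribute.

```python
import numpy as np


def relative_effect(reference, synthetic):
    """Estimate P(ref < syn) + 0.5 * P(ref == syn) over all pairs.

    Hypothetical helper for illustration; the direction convention is an
    assumption and may differ from the one used inside statsmodels.
    """
    ref = np.asarray(reference, dtype=float)[:, None]  # column vector
    syn = np.asarray(synthetic, dtype=float)[None, :]  # row vector
    less = (ref < syn).mean()   # share of pairs where the synthetic value is larger
    ties = (ref == syn).mean()  # share of tied pairs, counted with weight one half
    return less + 0.5 * ties


# Toy reference data and a synthetic sample shifted by one unit
# (made-up numbers, not taken from any study).
reference = np.array([3, 5, 6, 7, 5, 8, 6, 4])
synthetic = reference + 1

p_hat = relative_effect(reference, synthetic)
p_null = relative_effect(reference, reference)  # identical samples give 0.5
```

A value of 0.5 means neither group tends to have larger values; the further the estimate lies from 0.5, the smaller the sample size one would expect to need.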

Choosing between one-sided and two-sided tests has important implications for sample size planning. A two-sided test is more conservative and requires a larger sample size, but it covers effects in both directions. In contrast, a one-sided test with alternative ‘larger’ (relative_effect > 0.5) or ‘smaller’ (relative_effect < 0.5) assumes the effect occurs in only one direction, which leads to a smaller required sample size. However, if the true effect is in the opposite direction, the one-sided test has virtually no power to detect it. Additionally, if a two-sided test ends up being used instead of the planned one-sided test, the original sample size may be insufficient, resulting in an underpowered study. It is important to weigh these trade-offs carefully when planning a study.
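One way to see why the one-sided design needs fewer observations is to compare the critical values under a normal approximation. This is a rough sketch using only the Python standard library; it shows the general mechanism and is not a reproduction of the function's internal formula, which this page does not spell out.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist().inv_cdf  # quantile function of the standard normal

z_two_sided = z(1 - alpha / 2)  # alpha is split across both tails
z_one_sided = z(1 - alpha)      # all of alpha sits in one tail

# The one-sided critical value is smaller, so the same effect clears
# the rejection threshold with fewer observations.
print(f"two-sided: {z_two_sided:.3f}, one-sided: {z_one_sided:.3f}")
```

The same comparison also shows the risk described above: a study sized for the smaller one-sided threshold falls short if the analysis later uses the two-sided threshold.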

For nobs_ratio > 1, nobs_ratio = 1, or nobs_ratio < 1, the reference group sample size is larger, equal to, or smaller than the treatment group sample size, respectively.
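The relation nobs_ref = nobs_ratio * nobs_treat determines how a total sample size is divided between the groups. The following is a minimal sketch of that arithmetic; the helper name split_total is hypothetical, and any rounding the function may apply to its results is not modeled here.

```python
def split_total(nobs_total, nobs_ratio=1.0):
    """Split a total sample size using nobs_ref = nobs_ratio * nobs_treat.

    Hypothetical helper for illustration only.
    """
    if nobs_ratio <= 0:
        raise ValueError("nobs_ratio must be positive")
    # nobs_total = nobs_treat + nobs_ref = nobs_treat * (1 + nobs_ratio)
    nobs_treat = nobs_total / (1.0 + nobs_ratio)
    nobs_ref = nobs_ratio * nobs_treat
    return nobs_treat, nobs_ref


balanced = split_total(90, nobs_ratio=1)    # equal group sizes
more_ref = split_total(90, nobs_ratio=2)    # reference group twice as large
less_ref = split_total(90, nobs_ratio=0.5)  # reference group half as large
```

With a total of 90, the three calls give treatment and reference sizes of 45 and 45, 30 and 60, and 60 and 30, respectively.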

References

[1] Happ, M., Bathke, A. C., and Brunner, E. “Optimal sample size planning for the Wilcoxon-Mann-Whitney test”. Statistics in Medicine, vol. 38, pp. 363-375, 2019. https://doi.org/10.1002/sim.7983.

[2] Thall, P. F., and Vail, S. C. “Some covariance models for longitudinal count data with overdispersion”. Biometrics, pp. 657-671, 1990.