statsmodels.stats.proportion.test_proportions_2indep

statsmodels.stats.proportion.test_proportions_2indep(count1, nobs1, count2, nobs2, value=None, method=None, compare='diff', alternative='two-sided', correction=True, return_results=True)

Hypothesis test for comparing two independent proportions

This assumes that we have two independent binomial samples.

The null and alternative hypotheses are

for compare = ‘diff’

  • H0: prop1 - prop2 - value = 0

  • H1: prop1 - prop2 - value != 0 if alternative = ‘two-sided’

  • H1: prop1 - prop2 - value > 0 if alternative = ‘larger’

  • H1: prop1 - prop2 - value < 0 if alternative = ‘smaller’

for compare = ‘ratio’

  • H0: prop1 / prop2 - value = 0

  • H1: prop1 / prop2 - value != 0 if alternative = ‘two-sided’

  • H1: prop1 / prop2 - value > 0 if alternative = ‘larger’

  • H1: prop1 / prop2 - value < 0 if alternative = ‘smaller’

for compare = ‘odds-ratio’

  • H0: or - value = 0

  • H1: or - value != 0 if alternative = ‘two-sided’

  • H1: or - value > 0 if alternative = ‘larger’

  • H1: or - value < 0 if alternative = ‘smaller’

where the odds ratio is or = (prop1 / (1 - prop1)) / (prop2 / (1 - prop2))
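As a minimal sketch of the three quantities these hypotheses refer to, the following computes the risk difference, risk ratio and odds ratio from raw counts. The helper name `effect_measures` and the counts are illustrative assumptions and are not part of statsmodels.

```python
def effect_measures(count1, nobs1, count2, nobs2):
    """Return (diff, ratio, odds_ratio) for two sample proportions."""
    prop1 = count1 / nobs1
    prop2 = count2 / nobs2
    diff = prop1 - prop2                      # compare='diff'
    ratio = prop1 / prop2                     # compare='ratio'
    odds_ratio = (prop1 / (1 - prop1)) / (prop2 / (1 - prop2))  # compare='odds-ratio'
    return diff, ratio, odds_ratio


# Hypothetical data: 30 of 100 events in sample 1, 20 of 100 in sample 2.
diff, ratio, odds_ratio = effect_measures(30, 100, 20, 100)
print(diff, ratio, odds_ratio)
```

Under the default null hypothesis of equal proportions, these quantities are compared with 0, 1 and 1 respectively.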

Parameters:
count1 : int

Count for first sample.

nobs1 : int

Sample size for first sample.

count2 : int

Count for the second sample.

nobs2 : int

Sample size for the second sample.

value : float

Value of the difference, risk ratio or odds ratio of 2 independent proportions under the null hypothesis. The default corresponds to equal proportions: 0 for diff, and 1 for ratio and odds-ratio.

method : string

Method for computing the hypothesis test. If method is None, then a default method is used. The default might change as more methods are added.

diff:

  • ‘wald’

  • ‘agresti-caffo’

  • ‘score’: if correction is True, then this uses the degrees of freedom correction nobs / (nobs - 1) as in Miettinen and Nurminen (1985)

ratio:

  • ‘log’: Wald test using log transformation

  • ‘log-adjusted’: Wald test using log transformation, adds 0.5 to counts

  • ‘score’: if correction is True, then this uses the degrees of freedom correction nobs / (nobs - 1) as in Miettinen and Nurminen (1985)

odds-ratio:

  • ‘logit’: Wald test using logit transformation

  • ‘logit-adjusted’: Wald test using logit transformation, adds 0.5 to counts

  • ‘logit-smoothed’: Wald test using logit transformation, biases cell counts towards independence by adding two observations in total

  • ‘score’: if correction is True, then this uses the degrees of freedom correction nobs / (nobs - 1) as in Miettinen and Nurminen (1985)

compare : {'diff', 'ratio', 'odds-ratio'}

If compare is diff, then the hypothesis test is for the risk difference diff = p1 - p2. If compare is ratio, then the hypothesis test is for the risk ratio defined by ratio = p1 / p2. If compare is odds-ratio, then the hypothesis test is for the odds ratio defined by or = p1 / (1 - p1) / (p2 / (1 - p2)).

alternative : {'two-sided', 'smaller', 'larger'}

Alternative hypothesis, which can be two-sided or either one of the one-sided tests.

correction : bool

If correction is True (default), then the Miettinen and Nurminen small sample correction to the variance nobs / (nobs - 1) is used. Applies only if method=‘score’.

return_results : bool

If True, then a results instance with extra information is returned, otherwise a tuple with statistic and pvalue is returned.

Returns:

results – If return_results is True, then a results instance with the information in attributes is returned. If return_results is False, then only statistic and pvalue are returned.

statistic : float

Test statistic, asymptotically standard normal, N(0, 1).

pvalue : float

p-value based on the normal distribution.

other attributes :

Additional information about the hypothesis test.

Return type:

results instance or tuple

Notes

Status: experimental, API and defaults might still change.

More methods will be added.

The current default methods are

  • ‘diff’: ‘agresti-caffo’,

  • ‘ratio’: ‘log-adjusted’,

  • ‘odds-ratio’: ‘logit-adjusted’