statsmodels.stats.inter_rater.cohens_kappa

statsmodels.stats.inter_rater.cohens_kappa(table, weights=None, return_results=True, wt=None)

Compute Cohen’s kappa with its variance and a test of the null hypothesis that kappa equals zero.

Parameters:

table : array_like, 2-Dim

Square array with the results of two raters, with one rater in the rows and the second rater in the columns.

weights : array_like

The interpretation of weights depends on the wt argument. If both are None, then the simple kappa is computed. See wt for the case when wt is not None. If weights is two-dimensional, then it is used directly as the weight matrix. For computing the variance of kappa, the maximum of the weights is assumed to be less than or equal to one. TODO: fix conflicting definitions in the 2-Dim case for

wt : {None, str}

If wt and weights are both None, then the simple kappa is computed. If wt is given but weights is None, then the weights are set to be [0, 1, 2, …, k]. If weights is a one-dimensional array, then it is used to construct the weight matrix according to the following options.

wt in [‘linear’, ‘ca’ or None] : use linear weights (Cicchetti-Allison); the actual weights are linear in the difference between the scores given in weights.
wt in [‘quadratic’, ‘fc’] : use quadratic weights (Fleiss-Cohen); the actual weights are the squared difference between the scores given in weights.
wt = ‘toeplitz’ : the weight matrix is constructed as a Toeplitz matrix from the one-dimensional weights.

return_results : bool

If True (default), then an instance of KappaResults is returned. If False, then only kappa is computed and returned.

Returns:

results or kappa : If return_results is True (default), then a results instance with all statistics is returned. If return_results is False, then only kappa is calculated and returned.
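To make concrete what the simple kappa value represents, here is a minimal, dependency-free sketch of the standard unweighted formula, kappa = (p_obs - p_exp) / (1 - p_exp). The helper name `simple_kappa` and the example table are hypothetical, and this is not the library's implementation; it only illustrates the quantity that the simple case (weights and wt both None) is documented to compute.

```python
def simple_kappa(table):
    """Unweighted Cohen's kappa for a square table of counts (list of lists).

    Hypothetical helper for illustration; not part of statsmodels.
    """
    k = len(table)
    nobs = sum(sum(row) for row in table)
    # Observed agreement: share of cases on the diagonal.
    p_obs = sum(table[i][i] for i in range(k)) / nobs
    # Marginal proportions of the row rater and of the column rater.
    row = [sum(table[i]) / nobs for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / nobs for j in range(k)]
    # Agreement expected by chance if the two raters were independent.
    p_exp = sum(row[i] * col[i] for i in range(k))
    return (p_obs - p_exp) / (1 - p_exp)


# Rows: first rater, columns: second rater (made-up counts).
table = [[20, 5],
         [10, 15]]
kappa = simple_kappa(table)
print(round(kappa, 4))  # 0.4
```

Here the raters agree on 70% of cases while 50% agreement would be expected by chance, so kappa is 0.2 / 0.5 = 0.4.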

Notes

There are two conflicting definitions of the weight matrix, Wikipedia versus the SAS manual. However, the computation is invariant to rescaling of the weight matrix, so there is no difference in the results.
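The rescaling invariance can be checked with a small sketch. It assumes the disagreement-weight form of weighted kappa, kappa_w = 1 - sum(w * p_observed) / sum(w * p_expected), with zero weights on the diagonal; which convention the library uses internally is not stated here, so treat this form, the helper name `weighted_kappa`, and the example table as assumptions for illustration only.

```python
def weighted_kappa(table, w):
    """Weighted kappa using disagreement weights w (w[i][i] == 0).

    Hypothetical helper for illustration; not part of statsmodels.
    """
    k = len(table)
    nobs = sum(sum(r) for r in table)
    row = [sum(table[i]) / nobs for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / nobs for j in range(k)]
    # Weighted disagreement actually observed.
    observed = sum(w[i][j] * table[i][j] / nobs
                   for i in range(k) for j in range(k))
    # Weighted disagreement expected under independent raters.
    expected = sum(w[i][j] * row[i] * col[j]
                   for i in range(k) for j in range(k))
    return 1 - observed / expected


table = [[10, 2, 0],
         [3, 8, 2],
         [1, 2, 12]]  # made-up counts

# Linear distance weights, and the same weights rescaled so the maximum is 1.
w = [[abs(i - j) for j in range(3)] for i in range(3)]
w_scaled = [[x / 2 for x in r] for r in w]

k1 = weighted_kappa(table, w)
k2 = weighted_kappa(table, w_scaled)
print(abs(k1 - k2) < 1e-12)  # True: the scale factor cancels in the ratio
```

Because the weights enter only through a ratio of two weighted sums, any positive constant factor cancels.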

Weights for ‘linear’ and ‘quadratic’ are interpreted as scores for the categories; the weights in the computation are based on the pairwise differences between the scores. Weights for ‘toeplitz’ are interpreted as weighted distances. The distance depends only on how many levels apart two entries in the table are, not on the levels themselves.

Examples:

weights = [0, 1, 2, 3] and wt is either ‘linear’ or ‘toeplitz’ means that the weighting depends only on the simple distance between levels.

weights = [0, 0, 1, 1] and wt = ‘linear’ means that the first two levels are zero distance apart, and likewise the last two levels. This is the same as forming two aggregated levels by merging the first two and the last two levels, respectively.

weights = [0, 1, 2, 3] and wt = ‘quadratic’ is the same as squaring these weights and using wt = ‘toeplitz’.
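The three cases above can be reproduced with a short, dependency-free sketch of how a one-dimensional weights vector might be expanded into a weight matrix under each option. The helper names are hypothetical and any normalization the library applies is left out, which should not matter for kappa given the rescaling note above.

```python
def linear_weights(scores):
    # Absolute difference between category scores.
    return [[abs(a - b) for b in scores] for a in scores]


def quadratic_weights(scores):
    # Squared difference between category scores.
    return [[(a - b) ** 2 for b in scores] for a in scores]


def toeplitz_weights(w):
    # Entry (i, j) depends only on how many levels apart i and j are.
    k = len(w)
    return [[w[abs(i - j)] for j in range(k)] for i in range(k)]


scores = [0, 1, 2, 3]

# Equally spaced scores: 'linear' and 'toeplitz' give the same matrix.
same_lin = linear_weights(scores) == toeplitz_weights(scores)

# 'quadratic' equals 'toeplitz' applied to the squared weights.
same_quad = quadratic_weights(scores) == toeplitz_weights([s ** 2 for s in scores])

# Scores [0, 0, 1, 1]: levels 1-2 and levels 3-4 are zero distance apart.
merged = linear_weights([0, 0, 1, 1])

print(same_lin, same_quad)
for row in merged:
    print(row)
```

The `merged` matrix has zero blocks for the pairs (1, 2) and (3, 4), which is what treating them as two aggregated levels amounts to.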

References

Wikipedia
SAS Manual