statsmodels.stats.weightstats.DescrStatsW
- class statsmodels.stats.weightstats.DescrStatsW(data, weights=None, ddof=0)
Descriptive statistics and tests for data with case weights.
Assumes that the data is 1-D or 2-D with shape (nobs, nvars), observations in rows and variables in columns, and that the same weight applies to every column.
If a degrees of freedom correction is used, the weights should add up to the number of observations. ttest also assumes that the sum of weights corresponds to the sample size.
If the weights are integers, often called case or frequency weights, this is essentially the same as replicating each observation by its weight.
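The equivalence between integer weights and replicated observations can be checked with a few lines of plain Python. This is a minimal sketch of the arithmetic, not the library's implementation; the helper names `weighted_mean` and `weighted_var` are hypothetical and the variance uses no degrees of freedom correction (ddof=0).

```python
import math
from statistics import fmean, pvariance


def weighted_mean(data, weights):
    """Weighted mean: sum(w * x) / sum(w). Hypothetical helper."""
    return sum(w * x for x, w in zip(data, weights)) / sum(weights)


def weighted_var(data, weights):
    """Weighted variance with ddof=0. Hypothetical helper."""
    m = weighted_mean(data, weights)
    return sum(w * (x - m) ** 2 for x, w in zip(data, weights)) / sum(weights)


data = [1.0, 2.0, 4.0]
weights = [2, 1, 3]  # integer case (frequency) weights

# Replicate each observation by its integer weight.
replicated = [x for x, w in zip(data, weights) for _ in range(w)]

# Weighted statistics agree with unweighted statistics of the replicated data.
assert math.isclose(weighted_mean(data, weights), fmean(replicated))
assert math.isclose(weighted_var(data, weights), pvariance(replicated))
```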
- Parameters:
- data : array_like, 1-D or 2-D
Dataset.
- weights : None or 1-D ndarray
Weights for each observation, with the same length as the zero axis of data.
- ddof : int
Degrees of freedom correction used for the second moments var, std, cov and corrcoef. Default is ddof=0. Statistical tests are independent of ddof and are based on the standard formulas.
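To illustrate how ddof enters the second moments, the sketch below assumes the correction divides the weighted sum of squares of the demeaned data by (sum of weights - ddof). The function `weighted_var_ddof` is a hypothetical helper written for this illustration; with integer weights and ddof=1 it reproduces the ordinary sample variance of the replicated data.

```python
import math
from statistics import pvariance, variance


def weighted_var_ddof(data, weights, ddof=0):
    """Weighted variance, assuming division by (sum_weights - ddof)."""
    sum_weights = sum(weights)
    mean = sum(w * x for x, w in zip(data, weights)) / sum_weights
    sumsquares = sum(w * (x - mean) ** 2 for x, w in zip(data, weights))
    return sumsquares / (sum_weights - ddof)


data = [1.0, 2.0, 4.0]
weights = [2, 1, 3]  # weights sum to the number of cases, 6
replicated = [x for x, w in zip(data, weights) for _ in range(w)]

# ddof=0 matches the population variance, ddof=1 the sample variance.
assert math.isclose(weighted_var_ddof(data, weights, ddof=0), pvariance(replicated))
assert math.isclose(weighted_var_ddof(data, weights, ddof=1), variance(replicated))
```

The correction is only meaningful when the weights add up to the number of observations, since the denominator otherwise no longer counts cases.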
Examples
>>> import numpy as np
>>> from scipy import stats
>>> from statsmodels.stats.weightstats import DescrStatsW
>>> np.random.seed(0)
>>> x1_2d = 1.0 + np.random.randn(20, 3)
>>> w1 = np.random.randint(1, 4, 20)
>>> d1 = DescrStatsW(x1_2d, weights=w1)
>>> d1.mean
array([ 1.42739844, 1.23174284, 1.083753 ])
>>> d1.var
array([ 0.94855633, 0.52074626, 1.12309325])
>>> d1.std_mean
array([ 0.14682676, 0.10878944, 0.15976497])

>>> tstat, pval, df = d1.ttest_mean(0)
>>> tstat; pval; df
array([ 9.72165021, 11.32226471, 6.78342055])
array([ 1.58414212e-12, 1.26536887e-14, 2.37623126e-08])
44.0

>>> tstat, pval, df = d1.ttest_mean([0, 1, 1])
>>> tstat; pval; df
array([ 9.72165021, 2.13019609, 0.52422632])
array([ 1.58414212e-12, 3.87842808e-02, 6.02752170e-01])
44.0

# if weights are integers, then asrepeats can be used
>>> x1r = d1.asrepeats()
>>> x1r.shape
...
>>> stats.ttest_1samp(x1r, [0, 1, 1])
...
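The printed numbers in the examples are consistent with simple relationships between the attributes, which the following sketch checks in plain Python. Two points are assumptions inferred from the output rather than stated in the text: that df equals the sum of weights minus one (so the weights sum to 45 here), and that std_mean equals sqrt(var / (sum_weights - 1)) with var computed at ddof=0. The helper `tstat_for` is hypothetical.

```python
import math

# Values copied from the printed output in the examples (rounded to 8 decimals).
mean = [1.42739844, 1.23174284, 1.083753]
var = [0.94855633, 0.52074626, 1.12309325]
std_mean = [0.14682676, 0.10878944, 0.15976497]
df = 44.0

# Assumption: df = sum_weights - 1.
sum_weights = df + 1

# Assumption: std_mean = sqrt(var / (sum_weights - 1)), var at ddof=0.
std_mean_check = [math.sqrt(v / (sum_weights - 1)) for v in var]


def tstat_for(value):
    """t statistic per column: (mean - value) / std_mean."""
    return [(m - v) / s for m, v, s in zip(mean, value, std_mean)]


t_zero = tstat_for([0, 0, 0])   # compare with d1.ttest_mean(0)
t_mixed = tstat_for([0, 1, 1])  # compare with d1.ttest_mean([0, 1, 1])
```

Up to rounding of the copied inputs, `t_zero` and `t_mixed` agree with the tstat arrays shown in the examples.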
- Attributes:
- corrcoef
weighted correlation with default ddof
assumes variables in columns and observations in rows
- cov
weighted covariance of data if data is 2-dimensional
assumes variables in columns and observations in rows; uses default ddof
- demeaned
data with weighted mean subtracted
- mean
weighted mean of data
- nobs
alias for number of observations/cases, equal to sum of weights
- std
standard deviation with default degrees of freedom correction
- std_mean
standard deviation of weighted mean
- sum
weighted sum of data
- sum_weights
Sum of weights
- sumsquares
weighted sum of squares of demeaned data
- var
variance with default degrees of freedom correction
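The weighted covariance and correlation attributes can be sketched for two variables as follows. This is an illustration under the assumption that the covariance is the weighted sum of cross products of the demeaned data divided by (sum of weights - ddof); `weighted_cov` and `weighted_corr` are hypothetical helpers, not library functions.

```python
import math


def weighted_cov(x, y, weights, ddof=0):
    """Weighted covariance of two variables, assuming division by (sum_weights - ddof)."""
    sum_weights = sum(weights)
    mean_x = sum(w * a for a, w in zip(x, weights)) / sum_weights
    mean_y = sum(w * b for b, w in zip(y, weights)) / sum_weights
    cross = sum(w * (a - mean_x) * (b - mean_y) for a, b, w in zip(x, y, weights))
    return cross / (sum_weights - ddof)


def weighted_corr(x, y, weights, ddof=0):
    """Weighted correlation: covariance scaled by both standard deviations."""
    cov_xy = weighted_cov(x, y, weights, ddof)
    var_x = weighted_cov(x, x, weights, ddof)
    var_y = weighted_cov(y, y, weights, ddof)
    return cov_xy / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 4.0]
y = [3.0, 5.0, 9.0]  # y = 2 * x + 1, so the correlation is 1
weights = [2, 1, 3]

corr_ddof0 = weighted_corr(x, y, weights, ddof=0)
corr_ddof1 = weighted_corr(x, y, weights, ddof=1)
```

In this sketch the ddof factor cancels between numerator and denominator, so the correlation does not depend on it, while the covariance does.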
Methods
- asrepeats()
get array that has repeats given by floor(weights)
- get_compare(other[, weights])
return an instance of CompareMeans with self and other
- quantile(probs[, return_pandas])
compute quantiles for a weighted sample
- std_ddof([ddof])
standard deviation of data with given ddof
- tconfint_mean([alpha, alternative])
two-sided confidence interval for weighted mean of data, based on the t distribution
- ttest_mean([value, alternative])
t-test of the null hypothesis that the mean is equal to value
- ttost_mean(low, upp)
test of (non-)equivalence of one sample
- var_ddof([ddof])
variance of data given ddof
- zconfint_mean([alpha, alternative])
two-sided confidence interval for weighted mean of data, based on the normal distribution
- ztest_mean([value, alternative])
z-test of the null hypothesis that the mean is equal to value
- ztost_mean(low, upp)
test of (non-)equivalence of one sample, based on z-test
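The description "repeats given by floor(weights)" has a practical consequence for non-integer weights, shown in the sketch below. `as_repeats` is a hypothetical plain-Python helper that mimics the described behavior on a list of rows; it is not the library method.

```python
import math


def as_repeats(rows, weights):
    """Repeat each row floor(weight) times (hypothetical helper)."""
    repeated = []
    for row, w in zip(rows, weights):
        repeated.extend([row] * math.floor(w))
    return repeated


rows = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]
weights = [2, 1.9, 0.5]  # non-integer weights are truncated, not rounded

expanded = as_repeats(rows, weights)
# The first row appears twice, the second once, and the third is dropped
# because floor(0.5) == 0.
```

With integer weights the expanded data has as many rows as the sum of weights, which is what makes it usable with unweighted routines.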