statsmodels.imputation.mice.MICE

class statsmodels.imputation.mice.MICE(model_formula, model_class, data, n_skip=3, init_kwds=None, fit_kwds=None)

Multiple Imputation with Chained Equations.

This class can be used to fit most statsmodels models to data sets with missing values, using the ‘multiple imputation with chained equations’ (MICE) approach.

Parameters:
model_formula : str

The model formula to be fit to the imputed data sets. This formula is for the ‘analysis model’.

model_class : statsmodels model

The model to be fit to the imputed data sets. This model class is for the ‘analysis model’.

data : MICEData instance

MICEData object containing the data set for which missing values will be imputed.

n_skip : int

The number of imputed datasets to skip between consecutive imputed datasets that are used for analysis.

init_kwds : dict-like

Dictionary of keyword arguments passed to the init method of the analysis model.

fit_kwds : dict-like

Dictionary of keyword arguments passed to the fit method of the analysis model.
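
The effect of n_skip can be sketched without statsmodels. The helper below is hypothetical (kept_cycles is not part of the library) and assumes that burn-in cycles run first and that each analyzed data set is preceded by n_skip discarded ones, as the parameter description suggests. It only counts imputation cycles; it does not impute anything.

```python
def kept_cycles(n_burnin=10, n_imputations=10, n_skip=3):
    """Return the 1-based imputation cycles whose data sets are analyzed.

    Hypothetical illustration only; assumes burn-in comes first and that
    n_skip data sets are discarded before each one that is used.
    """
    kept = []
    cycle = n_burnin  # burn-in cycles are run but never analyzed
    for _ in range(n_imputations):
        cycle += n_skip + 1  # discard n_skip data sets, then use the next one
        kept.append(cycle)
    return kept


# Three imputations after ten burn-in cycles, skipping three in between.
print(kept_cycles(n_burnin=10, n_imputations=3, n_skip=3))  # [14, 18, 22]
```

A larger n_skip spaces the analyzed data sets further apart in the chain, which should reduce the dependence between them at the cost of more imputation cycles.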

Examples

Run all MICE steps and obtain results:

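>>> import statsmodels.api as sm
>>> from statsmodels.imputation import mice
>>> imp = mice.MICEData(data)  # data: a pandas DataFrame with missing values
>>> fml = 'y ~ x1 + x2 + x3 + x4'
>>> mi = mice.MICE(fml, sm.OLS, imp)  # do not rebind the `mice` module name
>>> results = mi.fit(10, 10)  # n_burnin=10, n_imputations=10
>>> print(results.summary())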
                         Results: MICE
=================================================================
Method:                    MICE       Sample size:           1000
Model:                     OLS        Scale                  1.00
Dependent variable:        y          Num. imputations       10
-----------------------------------------------------------------
           Coef.  Std.Err.    t     P>|t|   [0.025  0.975]  FMI
-----------------------------------------------------------------
Intercept -0.0234   0.0318  -0.7345 0.4626 -0.0858  0.0390 0.0128
x1         1.0305   0.0578  17.8342 0.0000  0.9172  1.1437 0.0309
x2        -0.0134   0.0162  -0.8282 0.4076 -0.0451  0.0183 0.0236
x3        -1.0260   0.0328 -31.2706 0.0000 -1.0903 -0.9617 0.0169
x4        -0.0253   0.0336  -0.7520 0.4521 -0.0911  0.0406 0.0269
=================================================================

Obtain a sequence of fitted analysis models without combining them into a pooled summary:

>>> imp = mice.MICEData(data)
>>> fml = 'y ~ x1 + x2 + x3 + x4'
>>> mi = mice.MICE(fml, sm.OLS, imp)  # do not rebind the `mice` module name
>>> results = []
>>> for k in range(10):
...     x = mi.next_sample()  # one fitted analysis model per imputed data set
...     results.append(x)

Methods

combine()

Pools MICE imputation results.

fit([n_burnin, n_imputations])

Fit a model using MICE.

next_sample()

Perform one complete MICE iteration.
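
Pooling of multiply imputed results is conventionally done with Rubin's rules. The sketch below applies those rules to a single coefficient in plain Python. It is an illustration of the general technique, not the statsmodels implementation: pool_rubin is a hypothetical helper, and the fraction of missing information (FMI) is computed in its simplest form, without small-sample corrections.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    estimates: the m point estimates of the coefficient
    variances: the m squared standard errors of those estimates
    Returns (pooled estimate, total variance, simplified FMI).
    """
    m = len(estimates)
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (ddof=1)
    total = within + (1 + 1 / m) * between
    fmi = (1 + 1 / m) * between / total
    return pooled, total, fmi


# Made-up estimates of one coefficient from three imputed data sets.
pooled, total, fmi = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
print(round(pooled, 4), round(total, 4), round(fmi, 4))
```

The total variance exceeds the average within-imputation variance whenever the estimates differ across imputed data sets, which is how the uncertainty due to missing values enters the pooled standard errors.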