statsmodels.imputation.mice.MICEData.set_imputer

MICEData.set_imputer(endog_name, formula=None, model_class=None, init_kwds=None, fit_kwds=None, predict_kwds=None, k_pmm=20, perturbation_method=None)

Specify the imputation process for a single variable.

Parameters:

endog_name : string

Name of the variable to be imputed.

formula : string

Conditional formula for imputation. Defaults to a formula with main effects for all other variables in the dataset. The formula should contain only the right-hand side (the mean structure), e.g. use ‘x1 + x2’, not ‘x4 ~ x1 + x2’.

model_class : statsmodels model

Conditional model for imputation. Defaults to OLS. See below for more information.

init_kwds : dict-like

Keyword arguments passed to the model init method.

fit_kwds : dict-like

Keyword arguments passed to the model fit method.

predict_kwds : dict-like

Keyword arguments passed to the model predict method.

k_pmm : int

The number of neighboring observations from which to sample randomly when using predictive mean matching.

perturbation_method : string

Either ‘gaussian’ or ‘bootstrap’. Determines the method for perturbing parameters in the imputation model. If None, uses the default specified at class initialization.
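
A minimal usage sketch of the parameters above, assuming statsmodels, NumPy and pandas are installed. The data are synthetic and the column names x1, x2 and x3 are made up for illustration. The model class is left at its default (OLS), and the formula contains only the right-hand side.

```python
import numpy as np
import pandas as pd
from statsmodels.imputation import mice

# Synthetic data (illustrative only): x1 depends on x2, x3 is noise.
rng = np.random.default_rng(0)
n = 200
values = rng.normal(size=(n, 3))
values[:, 0] += 0.5 * values[:, 1]

# Column names are kept as plain Python strings (object dtype).
columns = pd.Index(["x1", "x2", "x3"], dtype=object)
data = pd.DataFrame(values, columns=columns)

# Knock out 30 values of x1 so there is something to impute.
missing_rows = rng.choice(n, size=30, replace=False)
data.loc[missing_rows, "x1"] = np.nan

imp = mice.MICEData(data)

# Impute x1 from x2 and x3. Only the right-hand side is given;
# the endog name is supplied separately as the first argument.
imp.set_imputer(
    "x1",
    formula="x2 + x3",
    k_pmm=10,                        # sample among the 10 nearest matches
    perturbation_method="gaussian",  # override the class-level default
)

# Run a few imputation cycles; imp.data then holds the completed data.
imp.update_all(n_iter=5)
completed = imp.data
```

Variables without an explicit call to set_imputer keep the defaults described above.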

Notes

The model class must meet the following conditions:
  • The model must have a fit method that returns an object.
  • The object returned from fit must have a params attribute that is an array-like object.
  • The object returned from fit must have a cov_params method that returns a square array-like object.
  • The model must have a predict method.
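
The four conditions above can be illustrated with a small duck-typed class. This is only a sketch: LeastSquaresModel and LeastSquaresResults are hypothetical names, not part of statsmodels. The constructor signature (endog, exog) and the predict signature (params, exog) are assumptions that mirror statsmodels' OLS rather than requirements stated in the list. The example uses NumPy only and does not pass the class to MICEData.

```python
import numpy as np


class LeastSquaresResults:
    """Hypothetical results object returned by fit."""

    def __init__(self, params, cov):
        self.params = params  # array-like coefficient estimates
        self._cov = cov

    def cov_params(self):
        # Square covariance matrix of the coefficient estimates.
        return self._cov


class LeastSquaresModel:
    """Hypothetical model class; signatures mirror OLS by assumption."""

    def __init__(self, endog, exog):
        self.endog = np.asarray(endog, dtype=float)
        self.exog = np.asarray(exog, dtype=float)

    def fit(self):
        # Ordinary least squares via a numerically stable solver.
        coef, *_ = np.linalg.lstsq(self.exog, self.endog, rcond=None)
        resid = self.endog - self.exog @ coef
        dof = self.exog.shape[0] - self.exog.shape[1]
        sigma2 = resid @ resid / dof
        cov = sigma2 * np.linalg.inv(self.exog.T @ self.exog)
        return LeastSquaresResults(coef, cov)

    def predict(self, params, exog=None):
        # Linear prediction for the given parameter vector.
        exog = self.exog if exog is None else np.asarray(exog, dtype=float)
        return exog @ np.asarray(params)


# Quick check on synthetic data (intercept 1.0, slope 2.0).
rng = np.random.default_rng(0)
exog = np.column_stack([np.ones(50), rng.normal(size=50)])
endog = exog @ np.array([1.0, 2.0]) + rng.normal(scale=0.1, size=50)

model = LeastSquaresModel(endog, exog)
result = model.fit()
fitted = model.predict(result.params, exog)
```

Here result.params is a length-2 array and result.cov_params() is a 2 x 2 array. The params attribute and the cov_params method are what the ‘gaussian’ perturbation method would need in order to draw perturbed parameters.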