statsmodels.stats.outliers_influence.OLSInfluence

class statsmodels.stats.outliers_influence.OLSInfluence(results)

Class to calculate outlier and influence measures for an OLS result.

Parameters:

results : Regression Results instance

Currently assumes that the results are from an OLS regression.

Notes

One part of the results can be calculated without any auxiliary regression (some of these have the _internal postfix in the name). Other statistics require leave-one-observation-out (LOOO) auxiliary regressions and will be slower (mainly results with the _external postfix in the name). For the auxiliary LOOO regressions, only the required results are stored.

Using the LOOO measures is currently recommended only if the data set is not too large. One possible approach for LOOO measures would be to identify possible problem observations with the _internal measures, and then run the leave-one-observation-out regressions only for observations that are possible outliers. (However, this is not yet available in an automated way.)

This should be extended to general least squares.

The leave-one-variable-out (LOVO) auxiliary regressions are currently not used.

Methods

cooks_distance() (cached attribute) Cook's distance
cov_ratio() (cached attribute) covariance ratio between LOOO and original
det_cov_params_not_obsi() (cached attribute) determinant of cov_params of all LOOO regressions
dfbetas() (cached attribute) dfbetas
dffits() (cached attribute) dffits measure for influence of an observation
dffits_internal() (cached attribute) dffits measure for influence of an observation
ess_press() (cached attribute) error sum of squares of PRESS residuals
get_resid_studentized_external([sigma]) calculate studentized residuals
hat_diag_factor() (cached attribute) factor of diagonal of hat_matrix used in influence
hat_matrix_diag() (cached attribute) diagonal of the hat_matrix for OLS
influence() (cached attribute) influence measure
params_not_obsi() (cached attribute) parameter estimates for all LOOO regressions
resid_press() (cached attribute) PRESS residuals
resid_std() (cached attribute) estimate of standard deviation of the residuals
resid_studentized_external() (cached attribute) studentized residuals using LOOO variance
resid_studentized_internal() (cached attribute) studentized residuals using variance from OLS
resid_var() (cached attribute) estimate of variance of the residuals
sigma2_not_obsi() (cached attribute) error variance for all LOOO regressions
summary_frame() Creates a DataFrame with all available influence results.
summary_table([float_fmt]) create a summary table with all influence and outlier measures
