statsmodels.nonparametric.kernel_regression.KernelReg

class statsmodels.nonparametric.kernel_regression.KernelReg(endog, exog, var_type, reg_type='ll', bw='cv_ls', ckertype='gaussian', okertype='wangryzin', ukertype='aitchisonaitken', defaults=None)

Nonparametric kernel regression class.

Calculates the conditional mean E[y|X] where y = g(X) + e. The “local constant” estimator provided here is also known as Nadaraya-Watson kernel regression; “local linear” is an extension of it that suffers less from bias at the edges of the support. Specifying a custom kernel works only with “local linear” kernel regression. For example, a custom tricube kernel yields LOESS regression.
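The edge behaviour described above can be illustrated with a minimal sketch that fits both estimators to synthetic, nearly linear data and compares their predictions at the left boundary of the support. The data, the fixed bandwidth of 1.0, and all variable names are assumptions made for this example, not recommendations.

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

# Synthetic data (assumed for illustration): y = 2x + small noise on [0, 10].
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 101)
y = 2.0 * x + rng.normal(scale=0.1, size=x.size)

# A fixed, user-specified bandwidth keeps the example fast and deterministic.
bw = [1.0]
lc = KernelReg(endog=y, exog=x, var_type="c", reg_type="lc", bw=bw)
ll = KernelReg(endog=y, exog=x, var_type="c", reg_type="ll", bw=bw)

# Predict at the left edge of the support, where the true mean is 0.
edge = np.array([0.0])
lc_mean, _ = lc.fit(edge)
ll_mean, _ = ll.fit(edge)

lc_error = abs(lc_mean[0] - 0.0)
ll_error = abs(ll_mean[0] - 0.0)
print(f"local constant error at x=0: {lc_error:.3f}")
print(f"local linear error at x=0:   {ll_error:.3f}")
```

At the boundary the local constant estimator can only average over points to the right of x = 0, which pulls its estimate upward on increasing data; the local linear estimator fits a weighted line and so tracks the trend.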

Parameters
endog : array_like

This is the dependent variable.

exog : array_like

The training data for the independent variable(s). Each element in the list is a separate variable.

var_type : str

The type of the variables, one character per variable:

  • c: continuous

  • u: unordered (discrete)

  • o: ordered (discrete)

reg_type : {‘lc’, ‘ll’}, optional

Type of regression estimator: ‘lc’ is the local constant estimator and ‘ll’ is the local linear estimator. Default is ‘ll’.

bw : str or array_like, optional

Either a user-specified bandwidth or the method for bandwidth selection. If a string, valid values are ‘cv_ls’ (least-squares cross-validation) and ‘aic’ (AIC Hurvich bandwidth estimation). Default is ‘cv_ls’. A user-specified bandwidth must have as many entries as there are variables.

ckertype : str, optional

The kernel used for the continuous variables.

okertype : str, optional

The kernel used for the ordered discrete variables.

ukertype : str, optional

The kernel used for the unordered discrete variables.

defaults : EstimatorSettings instance, optional

The default values for the efficient bandwidth estimation.
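To show how the parameters above fit together for more than one regressor, here is a minimal sketch with two continuous variables. The synthetic data, the bandwidth values, and the variable names are assumptions chosen for illustration only.

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

# Synthetic data (assumed for illustration) with two continuous regressors.
rng = np.random.default_rng(1)
n = 80
x1 = rng.uniform(0.0, 1.0, size=n)
x2 = rng.uniform(0.0, 1.0, size=n)
y = np.sin(2.0 * np.pi * x1) + x2 + rng.normal(scale=0.1, size=n)

# exog is a list with one array per variable; var_type has one character
# per variable ('c' for each continuous regressor), and a user-specified
# bw has one entry per variable.
model = KernelReg(endog=y, exog=[x1, x2], var_type="cc",
                  reg_type="ll", bw=[0.1, 0.3])

mean, mfx = model.fit()
print(mean.shape)  # one fitted value per observation
print(mfx.shape)   # one marginal effect per observation and variable
```

The length of var_type determines how many variables the model expects, so it should match the number of arrays passed in exog.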

Attributes
bw : array_like

The bandwidth parameters.
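The following sketch contrasts a data-driven bandwidth with a user-specified one and reads the result back from the bw attribute. The synthetic data and the fixed value 0.4 are assumptions for the example; the cross-validated value depends on the data and is therefore not shown.

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

# Synthetic data (assumed for illustration).
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 2.0 * np.pi, size=60)
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)

# Data-driven bandwidth: least-squares cross-validation (the default).
cv_model = KernelReg(endog=y, exog=x, var_type="c", bw="cv_ls")

# User-specified bandwidth: one entry per variable.
fixed_model = KernelReg(endog=y, exog=x, var_type="c", bw=[0.4])

print("cv_ls bandwidth:", cv_model.bw)
print("fixed bandwidth:", fixed_model.bw)
```

Cross-validation refits the regression many times, so a fixed bandwidth is considerably cheaper when a reasonable value is already known.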

Methods

aic_hurvich(bw[, func])

Computes the AIC Hurvich criterion for the estimation of the bandwidth.

cv_loo(bw, func)

The cross-validation function with leave-one-out estimator.

fit([data_predict])

Returns the mean and marginal effects at the data_predict points.

loo_likelihood()

r_squared()

Returns the R-Squared for the nonparametric regression.

sig_test(var_pos[, nboot, nested_res, pivot])

Significance test for the variables in the regression.
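As a brief illustration of the fit and r_squared methods listed above, the sketch below evaluates a fitted model on a grid of new points and reports the in-sample R-squared. The synthetic linear data, the bandwidth, and the grid are assumptions made for the example.

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

# Synthetic data (assumed for illustration): y = 2x + small noise on [0, 10].
rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 101)
y = 2.0 * x + rng.normal(scale=0.1, size=x.size)

model = KernelReg(endog=y, exog=x, var_type="c", reg_type="ll", bw=[1.0])

# Evaluate the fit on a grid of new points rather than the training data.
grid = np.linspace(1.0, 9.0, 5)
mean, mfx = model.fit(grid)

print(mean)       # estimated E[y|x] at each grid point, close to 2 * grid
print(mfx[:, 0])  # estimated slope at each grid point, close to 2

# In-sample goodness of fit.
r2 = model.r_squared()
print(f"R-squared: {r2:.3f}")
```

Calling fit() without arguments evaluates the model at the training points instead; the marginal effects array has one column per variable in exog.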