statsmodels.nonparametric.kernel_regression.KernelReg

class statsmodels.nonparametric.kernel_regression.KernelReg(endog, exog, var_type, reg_type='ll', bw='cv_ls', defaults=<statsmodels.nonparametric._kernel_base.EstimatorSettings object>)

Nonparametric kernel regression class.

Calculates the conditional mean E[y|X] where y = g(X) + e. Note that the “local constant” type of regression provided here is also known as Nadaraya-Watson kernel regression; “local linear” is an extension of that which suffers less from bias issues at the edge of the support.
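The difference between the two estimator types can be sketched in a few lines of plain Python. This is an illustrative one-dimensional sketch with a Gaussian kernel, not the library's implementation; the helper names (gaussian_kernel, local_constant, local_linear) and the toy data are assumptions made for this example. On noise-free linear data, the local constant estimate is pulled toward the interior at the left edge of the support, while the local linear estimate recovers the true value.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel (hypothetical helper for this sketch)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def local_constant(x0, xs, ys, h):
    """Nadaraya-Watson estimate at x0: a kernel-weighted average of ys."""
    weights = [gaussian_kernel((x - x0) / h) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def local_linear(x0, xs, ys, h):
    """Local linear estimate at x0: the intercept of a kernel-weighted
    least-squares line fitted in the centred variable (x - x0)."""
    w = [gaussian_kernel((x - x0) / h) for x in xs]
    d = [x - x0 for x in xs]
    s0 = sum(w)
    s1 = sum(wi * di for wi, di in zip(w, d))
    s2 = sum(wi * di * di for wi, di in zip(w, d))
    t0 = sum(wi * yi for wi, yi in zip(w, ys))
    t1 = sum(wi * di * yi for wi, di, yi in zip(w, d, ys))
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


# Toy data (assumed for illustration): y = 2x on an even grid over [0, 1].
xs = [i / 20 for i in range(21)]
ys = [2.0 * x for x in xs]
h = 0.2  # bandwidth chosen by hand for this sketch

lc_edge = local_constant(0.0, xs, ys, h)  # biased upward: only points to the right exist
ll_edge = local_linear(0.0, xs, ys, h)    # reproduces the true value g(0) = 0
print(lc_edge, ll_edge)
```

At x = 0 every neighbouring observation lies to the right, so the weighted average used by the local constant estimator overshoots; fitting a local line corrects for that slope.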

Parameters:

endog: list with one element which is array_like

This is the dependent variable.

exog: list

The training data for the independent variable(s). Each element in the list is a separate variable.

var_type: str

The type of the variables, one character per variable:

  • c: continuous
  • u: unordered (discrete)
  • o: ordered (discrete)

reg_type: {‘lc’, ‘ll’}, optional

Type of regression estimator. ‘lc’ means the local constant estimator and ‘ll’ the local linear estimator. Default is ‘ll’.

bw: str or array_like, optional

Either a user-specified bandwidth or the method for bandwidth selection. If a string, valid values are ‘cv_ls’ (least-squares cross-validation) and ‘aic’ (AIC Hurvich bandwidth estimation). Default is ‘cv_ls’.

defaults: EstimatorSettings instance, optional

The default values for the efficient bandwidth estimation.

Attributes:

bw: array_like

The bandwidth parameters.
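As a minimal sketch of the two ways to set the bandwidth, the example below builds one model with least-squares cross-validation and one with a user-specified value, then reads the bw attribute of each. The synthetic data, the seed, and the value 0.4 are assumptions chosen only for illustration.

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

# Synthetic data (assumed for illustration): a noisy sine curve.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=60)
y = np.sin(x) + 0.1 * rng.normal(size=60)

# Bandwidth selected by least-squares cross-validation (the default method).
model_cv = KernelReg(endog=[y], exog=[x], var_type="c", bw="cv_ls")

# Bandwidth supplied by the user: one value per variable in var_type.
model_fixed = KernelReg(endog=[y], exog=[x], var_type="c", bw=[0.4])

print("cross-validated bw:", model_cv.bw)
print("user-specified bw: ", model_fixed.bw)
```

Supplying the bandwidth directly skips the numerical search, which can be useful when the same model is refitted many times.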

**Methods**

r_squared : calculates the R-squared coefficient for the model.

fit : calculates the conditional mean and marginal effects.

Methods

aic_hurvich(bw[, func]) Computes the AIC Hurvich criterion for the estimation of the bandwidth.
cv_loo(bw, func) The cross-validation function with leave-one-out estimator.
fit([data_predict]) Returns the mean and marginal effects at the data_predict points.
r_squared() Returns the R-Squared for the nonparametric regression.
sig_test(var_pos[, nboot, nested_res, pivot]) Significance test for the variables in the regression.
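A short usage sketch of fit and r_squared follows. The data, the seed, the bandwidth of 0.3, and the prediction grid are assumptions made for this example; the bandwidth is fixed by hand so that the example runs quickly and deterministically.

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

# Synthetic data (assumed for illustration): a noisy sine curve.
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=100)
y = np.sin(x) + 0.1 * rng.normal(size=100)

# Local linear regression on one continuous variable, user-specified bandwidth.
model = KernelReg(endog=[y], exog=[x], var_type="c", reg_type="ll", bw=[0.3])

# Without arguments, fit() evaluates at the training points.
fitted_mean, fitted_mfx = model.fit()

# With data_predict, fit() evaluates at the supplied points instead.
grid = np.linspace(-1.5, 1.5, 7)
grid_mean, grid_mfx = model.fit(data_predict=grid)

r2 = model.r_squared()
print("R-squared:", r2)
print("conditional mean on grid:", grid_mean)
```

The first returned array holds the estimated conditional mean and the second the marginal effects at the same points.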