statsmodels.regression.linear_model.OLS

class statsmodels.regression.linear_model.OLS(endog, exog=None, missing='none', hasconst=None, **kwargs)

Ordinary Least Squares

Parameters:

endog (array_like): A 1-d endogenous response variable. The dependent variable.
exog (array_like): A nobs x k array where nobs is the number of observations and k is the number of regressors. An intercept is not included by default and should be added by the user. See statsmodels.tools.add_constant.
missing (str): Available options are ‘none’, ‘drop’, and ‘raise’. If ‘none’, no nan checking is done. If ‘drop’, any observations with nans are dropped. If ‘raise’, an error is raised. Default is ‘none’.
hasconst (None or bool): Indicates whether the RHS includes a user-supplied constant. If True, a constant is not checked for, k_constant is set to 1, and all result statistics are calculated as if a constant is present. If False, a constant is not checked for and k_constant is set to 0.
**kwargs: Extra arguments that are used to set model properties when using the formula interface.

Attributes:

weights (scalar): Has an attribute weights = array(1.0) due to inheritance from WLS.

See also

WLS: Fit a linear model using Weighted Least Squares.
GLS: Fit a linear model using Generalized Least Squares.

Notes

No constant is added by the model unless you are using formulas.

Examples

>>> import statsmodels.api as sm
>>> import numpy as np
>>> duncan_prestige = sm.datasets.get_rdataset("Duncan", "carData")
>>> Y = duncan_prestige.data['income']
>>> X = duncan_prestige.data['education']
>>> X = sm.add_constant(X)
>>> model = sm.OLS(Y, X)
>>> results = model.fit()
>>> results.params
const        10.603498
education     0.594859
dtype: float64

>>> results.tvalues
const        2.039813
education    6.892802
dtype: float64

>>> print(results.t_test([1, 0]))
                             Test for Constraints
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
c0            10.6035      5.198      2.040      0.048       0.120      21.087
==============================================================================

>>> print(results.f_test(np.identity(2)))
<F test: F=array([[159.63031026]]), p=1.2607168903696672e-20,
 df_denom=43, df_num=2>

Methods

`fit`([method, cov_type, cov_kwds, use_t])	Full fit of the model.
`fit_regularized`([method, alpha, L1_wt, ...])	Return a regularized fit to a linear regression model.
`from_formula`(formula, data[, subset, drop_cols])	Create a Model from a formula and dataframe.
`get_distribution`(params, scale[, exog, ...])	Construct a random number generator for the predictive distribution.
`hessian`(params[, scale])	Evaluate the Hessian function at a given point.
`hessian_factor`(params[, scale, observed])	Calculate the weights for the Hessian.
`information`(params)	Fisher information matrix of model.
`initialize`()	Initialize model components.
`loglike`(params[, scale])	The likelihood function for the OLS model.
`predict`(params[, exog])	Return linear predicted values from a design matrix.
`score`(params[, scale])	Evaluate the score function at a given point.
`whiten`(x)	OLS model whitener does nothing.

Properties

`df_model`	The model degree of freedom.
`df_resid`	The residual degree of freedom.
`endog_names`	Names of endogenous variables.
`exog_names`	Names of exogenous variables.