Robust Linear Models

Robust linear models with support for the M-estimators listed under Norms.

See Module Reference for commands and arguments.

Examples

# Load modules and data
In [1]: import statsmodels.api as sm

In [2]: data = sm.datasets.stackloss.load(as_pandas=False)

In [3]: data.exog = sm.add_constant(data.exog)

# Fit model and print summary
In [4]: rlm_model = sm.RLM(data.endog, data.exog, M=sm.robust.norms.HuberT())

In [5]: rlm_results = rlm_model.fit()

In [6]: print(rlm_results.params)
[-41.02649835   0.82938433   0.92606597  -0.12784672]

Detailed examples can be found here:

Technical Documentation

References

  • PJ Huber. ‘Robust Statistics’ John Wiley and Sons, Inc., New York. 1981.

  • PJ Huber. 1973, ‘The 1972 Wald Memorial Lectures: Robust Regression: Asymptotics, Conjectures, and Monte Carlo.’ The Annals of Statistics, 1.5, 799-821.

  • WN Venables, BD Ripley. ‘Modern Applied Statistics with S’ Springer, New York.

  • C Croux, PJ Rousseeuw. ‘Time-efficient algorithms for two highly robust estimators of scale’ Computational Statistics. Physica, Heidelberg, 1992.

Module Reference

Model Classes

RLM(endog, exog[, M, missing])

Robust Linear Model

Model Results

RLMResults(model, params, …)

Class to contain RLM results.

Norms

AndrewWave([a])

Andrews’ wave for M-estimation.

Hampel([a, b, c])

Hampel function for M-estimation.

HuberT([t])

Huber’s T for M-estimation.

LeastSquares()

Least squares rho for M-estimation and its derived functions.

RamsayE([a])

Ramsay’s Ea for M-estimation.

RobustNorm()

The parent class for the norms used for robust regression.

TrimmedMean([c])

Trimmed mean function for M-estimation.

TukeyBiweight([c])

Tukey’s biweight function for M-estimation.

estimate_location(a, scale[, norm, axis, …])

M-estimator of location using self.norm and a current estimator of scale.

Scale

Huber([c, tol, maxiter, norm])

Huber’s proposal 2 for estimating location and scale jointly.

HuberScale([d, tol, maxiter])

Huber’s scaling for fitting robust linear models.

mad(a[, c, axis, center])

The median absolute deviation along the given axis of an array.

hubers_scale

Huber’s scaling for fitting robust linear models.

iqr(a[, c, axis])

The normalized interquartile range along the given axis of an array.

qn_scale(a[, c, axis])

Computes the Qn robust estimator of scale.
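To illustrate why these scale estimators are useful, the sketch below compares the ordinary standard deviation with `mad`, `iqr` and `qn_scale` on simulated normal data contaminated by a few gross outliers. The data, the seed and the contamination level are assumptions made for this example only.

```python
import numpy as np
from statsmodels.robust import scale

# Simulated data (an assumption for illustration): 500 normal draws with
# standard deviation 2, plus 25 gross outliers at 100
rng = np.random.default_rng(0)
clean = rng.normal(loc=0.0, scale=2.0, size=500)
contaminated = np.concatenate([clean, np.full(25, 100.0)])

std_est = float(np.std(contaminated))
mad_est = float(scale.mad(contaminated))
iqr_est = float(scale.iqr(contaminated))
qn_est = float(scale.qn_scale(contaminated))

print("standard deviation:", std_est)
print("mad:               ", mad_est)
print("iqr:               ", iqr_est)
print("qn_scale:          ", qn_est)
```

With the default normalizing constants, the three robust estimates are scaled to be comparable to the standard deviation for normal data, so they should stay near 2 here while the ordinary standard deviation is inflated by the outliers.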