statsmodels.stats.oaxaca.OaxacaBlinder

class statsmodels.stats.oaxaca.OaxacaBlinder(endog, exog, bifurcate, hasconst=True, swap=True, cov_type='nonrobust', cov_kwds=None)

Class to perform Oaxaca-Blinder Decomposition.

Parameters:
endog : array_like

The endogenous variable or the dependent variable that you are trying to explain.

exog : array_like

The exogenous variable(s) or the independent variable(s) that you are using to explain the endogenous variable.

bifurcate : {int, str}

The column of the exogenous variable(s) on which to split the sample into two groups. This is generally the group indicator whose difference in means you want to explain. Use the integer column index for a NumPy array, or the int/string column name for a pandas object.

hasconst : bool, optional

Indicates whether the exogenous variables already include a user-supplied constant. If True, a constant is assumed to be present. If False, a constant is added at the start. Defaults to True.

swap : bool, optional

Imitates the Stata oaxaca command by allowing users to choose whether to swap the groups. Unlike Stata, the default is True rather than False.

cov_type : str, optional

See regression.linear_model.RegressionResults for a description of the available covariance estimators.

cov_kwds : dict, optional

See linear_model.RegressionResults.get_robustcov_results for a description of the required keywords for alternative covariance estimators.

Notes

Please check whether your data includes a constant. The decomposition will still run, but it will return incorrect values if hasconst is set incorrectly.

You can access the underlying models through their attributes, e.g., _t_model for the total model, _f_model for the first model, _s_model for the second model.

Examples

>>> import numpy as np
>>> import statsmodels.api as sm
>>> from statsmodels.stats.oaxaca import OaxacaBlinder
>>> data = sm.datasets.ccards.load()

'3' is the index of the column that indicates the two groups whose difference we want to explain. In this case, it indicates whether the individual rents.

>>> model = OaxacaBlinder(data.endog, data.exog, 3, hasconst=False)
>>> model.two_fold().summary()
Oaxaca-Blinder Two-fold Effects
Unexplained Effect: 27.94091
Explained Effect: 130.80954
Gap: 158.75044
>>> model.three_fold().summary()
Oaxaca-Blinder Three-fold Effects
Endowments Effect: 321.74824
Coefficient Effect: 75.45371
Interaction Effect: -238.45151
Gap: 158.75044
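
In both summaries the reported effects sum to the gap (up to rounding). The following is a minimal sketch of why that holds, using the textbook decomposition formulas with a single regressor and plain Python instead of statsmodels. The toy data, the helper ols_1d, and the choice of group B as the reference are assumptions made for illustration; the library's own reference-group and pooled-model conventions may differ.

```python
from statistics import mean


def ols_1d(x, y):
    """Fit y = a + b*x by ordinary least squares and return (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


# Made-up toy data for two groups (illustrative only, not the ccards data).
x_a = [1.0, 2.0, 3.0, 4.0, 5.0]
y_a = [3.1, 4.9, 7.2, 8.8, 11.0]
x_b = [0.5, 1.5, 2.0, 3.0, 3.5]
y_b = [1.4, 2.7, 3.1, 4.6, 5.2]

a_a, b_a = ols_1d(x_a, y_a)              # group A model
a_b, b_b = ols_1d(x_b, y_b)              # group B model
a_p, b_p = ols_1d(x_a + x_b, y_a + y_b)  # pooled model (no group dummy)

gap = mean(y_a) - mean(y_b)
dx = mean(x_a) - mean(x_b)

# Three-fold decomposition with group B as the reference (an assumption).
endowments = dx * b_b
coefficients = (a_a - a_b) + mean(x_b) * (b_a - b_b)
interaction = dx * (b_a - b_b)

# Two-fold decomposition using the pooled coefficients as the benchmark.
explained = dx * b_p
unexplained = (
    (a_a - a_p) + mean(x_a) * (b_a - b_p)
    + (a_p - a_b) + mean(x_b) * (b_p - b_b)
)

print(f"gap          = {gap:.5f}")
print(f"three-fold   = {endowments + coefficients + interaction:.5f}")
print(f"two-fold     = {explained + unexplained:.5f}")
```

Both printed sums match the gap because an OLS fit with an intercept passes through the group means, so each decomposition only rearranges the difference between the two fitted means.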

Methods

three_fold([std, n, conf])

Calculates the three-fold Oaxaca Blinder Decompositions

two_fold([std, two_fold_type, ...])

Calculates the two-fold or pooled Oaxaca Blinder Decompositions

variance(decomp_type[, n, conf])

A helper function to calculate the variance/std.


Last update: Jan 20, 2025