Treatment effects under conditional independence¶
Author: Josef Perktold
This notebook illustrates the basic usage of the new treatment effect functionality in statsmodels.
The main class is statsmodels.treatment.treatment_effects.TreatmentEffect
.
This class estimates treatment effect and potential outcome using 5 different methods, ipw, ra, aipw, aipw-wls, ipw-ra. The last three methods require both a treatment or selection model and an outcome model. Standard errors and inference are based on the joint GMM representation of selection or treatment model, outcome model and effect functions. The approach for inference follows Stata, however Stata support a wider range of models. Estimation and inference are valid under conditional independence or ignorability.
The outcome model is currently limited to a linear model based on OLS. Treatment is currently restricted to binary treatment which can be either Logit or Probit.
The example follows Cattaneo.
[1]:
import os
import numpy as np
from numpy.testing import assert_allclose
import pandas as pd
from statsmodels.regression.linear_model import OLS
from statsmodels.discrete.discrete_model import Probit
from statsmodels.treatment.treatment_effects import (
TreatmentEffect
)
from statsmodels.treatment.tests.results import results_teffects as res_st
# Load data for example
cur_dir = os.path.abspath(os.path.dirname(res_st.__file__))
file_name = 'cataneo2.csv'
file_path = os.path.join(cur_dir, file_name)
dta_cat = pd.read_csv(file_path)
methods = ['ra', 'ipw', 'aipw', 'aipw_wls', 'ipw_ra']
methods_st = [
("ra", res_st.results_ra),
("ipw", res_st.results_ipw),
("aipw", res_st.results_aipw),
("aipw_wls", res_st.results_aipw_wls),
("ipw_ra", res_st.results_ipwra),
]
# allow wider display of data frames
pd.set_option('display.width', 500)
[2]:
dta_cat.head()
[2]:
bweight | mmarried | mhisp | fhisp | foreign | alcohol | deadkids | mage | medu | fage | ... | prenatal | birthmonth | lbweight | fbaby | prenatal1 | mbsmoke_ | mmarried_ | fbaby_ | prenatal1_ | mage2 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3459 | married | 0 | 0 | 0 | 0 | 0 | 24 | 14 | 28 | ... | 1 | 12 | 0 | No | Yes | 0 | 1 | 0 | 1 | 576.0 |
1 | 3260 | notmarried | 0 | 0 | 1 | 0 | 0 | 20 | 10 | 0 | ... | 1 | 7 | 0 | No | Yes | 0 | 0 | 0 | 1 | 400.0 |
2 | 3572 | married | 0 | 0 | 1 | 0 | 0 | 22 | 9 | 30 | ... | 1 | 3 | 0 | No | Yes | 0 | 1 | 0 | 1 | 484.0 |
3 | 2948 | married | 0 | 0 | 0 | 0 | 0 | 26 | 12 | 30 | ... | 1 | 1 | 0 | No | Yes | 0 | 1 | 0 | 1 | 676.0 |
4 | 2410 | married | 0 | 0 | 0 | 0 | 0 | 20 | 12 | 21 | ... | 1 | 3 | 1 | Yes | Yes | 0 | 1 | 1 | 1 | 400.0 |
5 rows × 28 columns
Create TreatmentEffect instance and compute ipw¶
The TreatmentEffect class requires
a OLS model instance for the outcome model,
a results instance of the selection model and
a treatment indicator variable.
In the following example we use Probit as the selection model. Using Logit is also supported.
[3]:
# treatment selection model
formula = 'mbsmoke_ ~ mmarried_ + mage + mage2 + fbaby_ + medu'
res_probit = Probit.from_formula(formula, dta_cat).fit()
# outcome model
formula_outcome = 'bweight ~ prenatal1_ + mmarried_ + mage + fbaby_'
mod = OLS.from_formula(formula_outcome, dta_cat)
# treatment indicator variable
tind = np.asarray(dta_cat['mbsmoke_'])
teff = TreatmentEffect(mod, tind, results_select=res_probit)
Optimization terminated successfully.
Current function value: 0.439575
Iterations 6
After creating the TreatmentEffect instance, we can call any of the 5 methods to compute potential outcomes, POM0, POM1, and average treatment effect, ATE. POM0 is the potential outcome for the no treatment group, POM1 is the potential outcome for the treatment group, treatment effect is POM1 - POM0.
For example teff.ipw()
computes POM and ATE using inverse probability weighting. The probability of treatment is also commonly called the propensity score. The summary
of the estimation includes standard errors and confidence interval for POM and ATE.
Standard errors and other inferential statistics are based on the Generalized Method of Moments (GMM) representation of the selection and outcome models and the moment conditions for the results statistic. Method ipw
uses the selection model but not the outcome model. Method ra
uses the outcome model but not the selection model. The doubly robust estimators aipw
, aipw-wls
, ipw-ra
include both selection and outcome models, where at least one of those two has to be correctly
specified to get consistent estimates of the treatment effect. The moment conditions for the target variables, POM0, POM1, and ATE are based on POM0 and ATE. The remaining POM1 is computed as a linear combination of POM0 and ATE.
The internal gmm results are attached to the treatment results as results_gmm
.
By default the treatment effect methods computes average treatment effect, where average is take over the sample observations. Option effect_group
can be used to compute either average treatment effect on the treated, ATT, using effect_group=1
or average treatment effect on the non-treated using effect_group=0
.
[4]:
res = teff.ipw()
res
[4]:
<class 'statsmodels.treatment.treatment_effects.TreatmentEffectResults'>
Test for Constraints
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ATE -230.6891 25.817 -8.936 0.000 -281.289 -180.089
POM0 3403.4632 9.571 355.586 0.000 3384.704 3422.223
POM1 3172.7741 24.001 132.193 0.000 3125.733 3219.815
==============================================================================
[5]:
res.summary_frame()
[5]:
coef | std err | z | P>|z| | Conf. Int. Low | Conf. Int. Upp. | |
---|---|---|---|---|---|---|
ATE | -230.689070 | 25.816758 | -8.935633 | 4.048542e-19 | -281.288985 | -180.089154 |
POM0 | 3403.463163 | 9.571412 | 355.586324 | 0.000000e+00 | 3384.703540 | 3422.222785 |
POM1 | 3172.774093 | 24.001059 | 132.193085 | 0.000000e+00 | 3125.732881 | 3219.815305 |
[6]:
print(res.results_gmm.summary())
_IPWGMM Results
==============================================================================
Dep. Variable: y Hansen J: 3.988e-09
Model: _IPWGMM Prob (Hansen J): nan
Method: GMM
Date: Tue, 07 Jan 2025
Time: 18:43:53
No. Observations: 4642
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
p 0 -230.6891 25.817 -8.936 0.000 -281.289 -180.089
p 1 3403.4632 9.571 355.586 0.000 3384.704 3422.223
p 2 -1.5583 0.461 -3.380 0.001 -2.462 -0.655
p 3 -0.6485 0.055 -11.711 0.000 -0.757 -0.540
p 4 0.1744 0.036 4.836 0.000 0.104 0.245
p 5 -0.0033 0.001 -4.921 0.000 -0.005 -0.002
p 6 -0.2176 0.050 -4.390 0.000 -0.315 -0.120
p 7 -0.0864 0.010 -8.630 0.000 -0.106 -0.067
==============================================================================
average treatment effect on the treated
see more below
[7]:
teff.ipw(effect_group=1)
[7]:
<class 'statsmodels.treatment.treatment_effects.TreatmentEffectResults'>
Test for Constraints
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ATE -225.1796 23.658 -9.518 0.000 -271.549 -178.811
POM0 3362.8393 14.198 236.855 0.000 3335.012 3390.667
POM1 3137.6597 19.071 164.526 0.000 3100.281 3175.038
==============================================================================
average treatment effect on the untreated
[8]:
teff.ipw(effect_group=0)
[8]:
<class 'statsmodels.treatment.treatment_effects.TreatmentEffectResults'>
Test for Constraints
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ATE -231.8782 27.699 -8.371 0.000 -286.168 -177.588
POM0 3412.9116 9.283 367.634 0.000 3394.716 3431.107
POM1 3181.0334 26.120 121.786 0.000 3129.840 3232.227
==============================================================================
Other methods to compute ATE work in the same or similar way as for ipw
for example regression adjustment ra
and double robust ipw_ra
.
[9]:
res_ra = teff.ra()
res_ra
[9]:
<class 'statsmodels.treatment.treatment_effects.TreatmentEffectResults'>
Test for Constraints
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ATE -239.6392 23.824 -10.059 0.000 -286.333 -192.945
POM0 3403.2423 9.525 357.288 0.000 3384.573 3421.911
POM1 3163.6031 21.864 144.698 0.000 3120.751 3206.455
==============================================================================
[10]:
res_ra.summary_frame()
[10]:
coef | std err | z | P>|z| | Conf. Int. Low | Conf. Int. Upp. | |
---|---|---|---|---|---|---|
ATE | -239.639211 | 23.824021 | -10.058722 | 8.408247e-24 | -286.333435 | -192.944988 |
POM0 | 3403.242272 | 9.525207 | 357.288006 | 0.000000e+00 | 3384.573209 | 3421.911335 |
POM1 | 3163.603060 | 21.863509 | 144.697867 | 0.000000e+00 | 3120.751371 | 3206.454750 |
[11]:
ra2 = teff.ipw_ra(effect_group=1, return_results=True)
ra2.summary_frame()
[11]:
coef | std err | z | P>|z| | Conf. Int. Low | Conf. Int. Upp. | |
---|---|---|---|---|---|---|
ATE | -223.545262 | 23.794008 | -9.395023 | 5.720507e-21 | -270.180660 | -176.909864 |
POM0 | 3361.204984 | 14.465009 | 232.367989 | 0.000000e+00 | 3332.854088 | 3389.555880 |
POM1 | 3137.659722 | 19.070923 | 164.525844 | 0.000000e+00 | 3100.281400 | 3175.038045 |
All methods in TreatmentEffect¶
The following computes and prints ATE and POM for all methods. (We include the call to TreatmentEffect as a reminder.)
[12]:
teff = TreatmentEffect(mod, tind, results_select=res_probit)
for m in methods:
res = getattr(teff, m)()
print("\n", m)
print(res.summary_frame())
ra
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -239.639211 23.824021 -10.058722 8.408247e-24 -286.333435 -192.944988
POM0 3403.242272 9.525207 357.288006 0.000000e+00 3384.573209 3421.911335
POM1 3163.603060 21.863509 144.697867 0.000000e+00 3120.751371 3206.454750
ipw
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -230.689070 25.816758 -8.935633 4.048542e-19 -281.288985 -180.089154
POM0 3403.463163 9.571412 355.586324 0.000000e+00 3384.703540 3422.222785
POM1 3172.774093 24.001059 132.193085 0.000000e+00 3125.732881 3219.815305
aipw
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -230.989648 26.214445 -8.811541 1.234375e-18 -282.369017 -179.610280
POM0 3403.355674 9.568514 355.682783 0.000000e+00 3384.601731 3422.109616
POM1 3172.366025 24.427402 129.869153 0.000000e+00 3124.489197 3220.242854
aipw_wls
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -227.195618 27.372036 -8.300282 1.038645e-16 -280.843822 -173.547414
POM0 3403.250651 9.596571 354.631943 0.000000e+00 3384.441717 3422.059585
POM1 3176.055033 25.654642 123.800406 0.000000e+00 3125.772859 3226.337206
ipw_ra
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -229.967078 26.629411 -8.635830 5.830196e-18 -282.159765 -177.774391
POM0 3403.335639 9.571288 355.577620 0.000000e+00 3384.576260 3422.095018
POM1 3173.368561 24.871955 127.588224 0.000000e+00 3124.620425 3222.116697
Results in Stata¶
The results in statsmodels are very close to the results in Stata because both packages use the same approach.
[13]:
for m, st in methods_st:
print("\n", m)
res = pd.DataFrame(st.table[:2, :6], index = ["ATE", "POM0"], columns=st.table_colnames[:6])
print(res)
ra
b se z pvalue ll ul
ATE -239.639211 23.824021 -10.058722 8.408247e-24 -286.333435 -192.944988
POM0 3403.242272 9.525207 357.288005 0.000000e+00 3384.573209 3421.911335
ipw
b se z pvalue ll ul
ATE -230.688638 25.815244 -8.936140 4.030006e-19 -281.285586 -180.091690
POM0 3403.462709 9.571369 355.587873 0.000000e+00 3384.703170 3422.222247
aipw
b se z pvalue ll ul
ATE -230.989201 26.210565 -8.812828 1.220276e-18 -282.360964 -179.617438
POM0 3403.355253 9.568472 355.684297 0.000000e+00 3384.601393 3422.109114
aipw_wls
b se z pvalue ll ul
ATE -227.195618 27.347936 -8.307597 9.765984e-17 -280.796587 -173.594649
POM0 3403.250651 9.596622 354.630065 0.000000e+00 3384.441618 3422.059684
ipw_ra
b se z pvalue ll ul
ATE -229.967078 26.626676 -8.636718 5.785117e-18 -282.154403 -177.779752
POM0 3403.335639 9.571260 355.578657 0.000000e+00 3384.576315 3422.094963
Treatment effects without inference¶
It is possible to compute POM and ATE without computing standard errors and inferential statistics. In this case the GMM model is not computed.
[14]:
for m in methods:
print("\n", m)
res = getattr(teff, m)(return_results=False)
print(res)
ra
(np.float64(-239.6392114643395), np.float64(3403.242271935487), np.float64(3163.6030604711477))
ipw
(np.float64(-230.6886377952617), np.float64(3403.4627086845567), np.float64(3172.7740708892948))
aipw
(np.float64(-230.98920111257803), np.float64(3403.3552531738355), np.float64(3172.3660520612575))
aipw_wls
(np.float64(-227.19561818674902), np.float64(3403.2506509757864), np.float64(3176.0550327890373))
ipw_ra
(np.float64(-229.96707793513224), np.float64(3403.3356393074205), np.float64(3173.3685613722882))
Treatment effect on the treated¶
Treatment effects on subgroups are not available for aipw
and aipw-wls
.
effect_group
choses the group for which treatement effect and potential outcomes are computed Options are “all” for sample average treatment effect, 1
for average treatment effect on the treated and 0
for average treatment effect on the untreated.
Note: The row labels in the pandas dataframe, POM and ATE, are the same even for treatment effect on subgroups.
[15]:
for m in methods:
if m.startswith("aipw"):
continue
res = getattr(teff, m)(effect_group=1)
print("\n", m)
print(res.summary_frame())
ra
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -223.301651 22.742195 -9.818826 9.342545e-23 -267.875534 -178.727767
POM0 3360.961373 12.757489 263.450069 0.000000e+00 3335.957154 3385.965592
POM1 3137.659722 19.070923 164.525844 0.000000e+00 3100.281400 3175.038045
ipw
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -225.179608 23.658119 -9.518069 1.764269e-21 -271.548669 -178.810546
POM0 3362.839334 14.197866 236.855264 0.000000e+00 3335.012028 3390.666640
POM1 3137.659726 19.070923 164.525845 0.000000e+00 3100.281404 3175.038049
ipw_ra
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -223.545262 23.794008 -9.395023 5.720507e-21 -270.180660 -176.909864
POM0 3361.204984 14.465009 232.367989 0.000000e+00 3332.854088 3389.555880
POM1 3137.659722 19.070923 164.525844 0.000000e+00 3100.281400 3175.038045
Treatment effect on the untreated¶
Similar to ATT, we can compute average treatment effect on the untreated by using option effect_group=0
.
[16]:
for m in methods:
if m.startswith("aipw"):
# not available
continue
res = getattr(teff, m)(effect_group=0)
print("\n", m)
print(res.summary_frame())
ra
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -243.375488 24.902030 -9.773319 1.465697e-22 -292.182569 -194.568406
POM0 3412.911593 9.283454 367.633804 0.000000e+00 3394.716358 3431.106829
POM1 3169.536106 23.128805 137.038471 0.000000e+00 3124.204480 3214.867731
ipw
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -231.878176 27.699436 -8.371224 5.702294e-17 -286.168073 -177.588279
POM0 3412.911593 9.283454 367.633804 0.000000e+00 3394.716357 3431.106829
POM1 3181.033418 26.119760 121.786472 0.000000e+00 3129.839629 3232.227206
ipw_ra
coef std err z P>|z| Conf. Int. Low Conf. Int. Upp.
ATE -231.125972 28.813022 -8.021580 1.043933e-15 -287.598458 -174.653487
POM0 3412.911593 9.283454 367.633804 0.000000e+00 3394.716358 3431.106829
POM1 3181.785621 27.301318 116.543297 0.000000e+00 3128.276021 3235.295221
The docstring for the TreatmentEffect class and it’s methods can be obtained using help
help(teff)