statsmodels.othermod.betareg.BetaModel
- class statsmodels.othermod.betareg.BetaModel(endog, exog, exog_precision=None, link=<statsmodels.genmod.families.links.Logit object>, link_precision=<statsmodels.genmod.families.links.Log object>, **kwds)
Beta Regression.
The model is parameterized by mean and precision. Both can depend on explanatory variables through link functions.
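As a rough illustration of the mean and precision parameterization, the sketch below converts two linear predictors into the shape parameters of a Beta distribution. It assumes the default links (logit for the mean, log for the precision) and the usual mean-precision convention, alpha = mean * precision and beta = (1 - mean) * precision. The helper names `inverse_logit` and `beta_shape_params` are hypothetical and are not part of statsmodels.

```python
import math


def inverse_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def beta_shape_params(eta_mean, eta_precision):
    """Convert linear predictors to Beta shape parameters.

    Assumes a logit link for the mean and a log link for the precision.
    """
    mean = inverse_logit(eta_mean)
    precision = math.exp(eta_precision)
    alpha = mean * precision
    beta = (1.0 - mean) * precision
    return mean, precision, alpha, beta


# Linear predictors chosen for illustration only.
mean, precision, alpha, beta = beta_shape_params(
    eta_mean=0.0, eta_precision=math.log(10.0)
)

# Under this parameterization, a larger precision means a smaller variance.
variance = mean * (1.0 - mean) / (1.0 + precision)
print(mean, precision, alpha, beta, variance)
```

Under this convention the precision acts as an inverse dispersion: for a fixed mean, raising the precision concentrates the distribution around that mean.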
- Parameters:
- endog : array_like
1d array of the endogenous response variable.
- exog : array_like
A nobs x k array where nobs is the number of observations and k is the number of regressors. An intercept is not included by default and should be added by the user (models specified using a formula include an intercept by default). See statsmodels.tools.add_constant.
- exog_precision : array_like
2d array of variables for the precision.
- link : link
Any link in sm.families.links for the mean; it should have its range in the interval [0, 1]. The default is the logit link.
- link_precision : link
Any link in sm.families.links for the precision; it should have its range on the positive real line. The default is the log link.
- **kwds : extra keywords
Keyword options that will be handled by super classes. Not all general keywords are supported in this class.
Notes
Status: experimental, new in 0.13. Core results are verified, but the API can change and some extra results specific to Beta regression are missing.
Examples
Beta regression with the default logit link for the mean and log link for the precision.
>>> from statsmodels.othermod.betareg import BetaModel
>>> mod = BetaModel(endog, exog)
>>> rslt = mod.fit()
>>> print(rslt.summary())
We can also specify a formula and a specific structure for the precision, and use the identity link for the precision.
>>> import patsy
>>> from statsmodels.genmod.families.links import identity
>>> Z = patsy.dmatrix('~ temp', dat, return_type='dataframe')
>>> mod = BetaModel.from_formula('iyield ~ C(batch, Treatment(10)) + temp',
...                              dat, exog_precision=Z,
...                              link_precision=identity())
In the case of proportion data, we may think that the precision depends on the number of measurements. For example, for sequence data, it may depend on the number of sequence reads covering a site:
>>> Z = patsy.dmatrix('~ coverage', df)
>>> formula = 'methylation ~ disease + age + gender + coverage'
>>> mod = BetaModel.from_formula(formula, df, exog_precision=Z)
>>> rslt = mod.fit()
- Attributes:
endog_names
Names of endogenous variables.
exog_names
Names of exogenous variables.
Methods
expandparams(params)
Expand to full parameter array when some parameters are fixed.
fit([start_params, maxiter, disp, method])
Fit the model by maximum likelihood.
from_formula(formula, data[, ...])
Create a Model from a formula and dataframe.
get_distribution(params[, exog, exog_precision])
Return an instance of the predictive distribution.
get_distribution_params(params[, exog, ...])
Return distribution parameters converted from model prediction.
hessian(params[, observed])
Hessian, second derivative of the loglikelihood function.
hessian_factor(params[, scale, observed])
Weights for calculating the Hessian.
information(params)
Fisher information matrix of the model.
initialize()
Initialize (possibly re-initialize) a Model instance.
loglike(params)
Log-likelihood of the model at params.
loglikeobs(params)
Loglikelihood for observations of the Beta regression model.
nloglike(params)
Negative log-likelihood of the model at params.
predict(params[, exog, exog_precision, which])
Predict values for the mean or precision.
predict_precision(params[, exog_precision])
Predict values for the precision function for given exog_precision.
predict_var(params[, exog, exog_precision])
Predict values for the conditional variance V(endog | exog).
reduceparams(params)
Reduce parameters.
score(params)
Return the score vector of the log-likelihood.
score_factor(params)
Derivative of loglikelihood function w.r.t.
score_hessian_factor(params[, ...])
Derivatives of loglikelihood function w.r.t.
score_obs(params)
Score, first derivative of the loglikelihood for each observation.