statsmodels.discrete.discrete_model.BinaryModel.fit_regularized
method
BinaryModel.fit_regularized(start_params=None, method='l1', maxiter='defined_by_method', full_output=1, disp=1, callback=None, alpha=0, trim_mode='auto', auto_trim_tol=0.01, size_trim_tol=0.0001, qc_tol=0.03, **kwargs)
Fit the model using regularized maximum likelihood. Both the regularization method and the solver are determined by the method argument.
- Parameters
- start_params : array-like, optional
Initial guess of the solution for the log-likelihood maximization. The default is an array of zeros.
- method : ‘l1’ or ‘l1_cvxopt_cp’
See Notes for details.
- maxiter : int or ‘defined_by_method’
Maximum number of iterations to perform. If ‘defined_by_method’, use the method's default (see Notes).
- full_output : bool
Set to True to have all available output in the Results object’s mle_retvals attribute. The output depends on the solver. See the LikelihoodModelResults Notes section for more information.
- disp : bool
Set to True to print convergence messages.
- fargs : tuple
Extra arguments passed to the likelihood function, i.e., loglike(x, *args).
- callback : callable callback(xk)
Called after each iteration as callback(xk), where xk is the current parameter vector.
- retall : bool
Set to True to return the list of solutions at each iteration. Available in the Results object’s mle_retvals attribute.
- alpha : non-negative scalar or numpy array (same size as parameters)
The weight multiplying the l1 penalty term.
- trim_mode : ‘auto’, ‘size’, or ‘off’
If not ‘off’, trim (set to zero) parameters that would have been zero if the solver had reached the theoretical minimum. If ‘auto’, trim parameters using the optimality conditions given in the Notes. If ‘size’, trim parameters whose absolute value is very small.
- size_trim_tol : float or ‘auto’
Tolerance used when trim_mode == ‘size’. The signature default is 0.0001.
- auto_trim_tol : float
Tolerance used when trim_mode == ‘auto’.
- qc_tol : float
Print a warning and do not allow auto trim when the second optimality condition in the Notes (\(|\partial_k L| \leq \alpha_k\) when \(\beta_k = 0\)) is violated by this much.
- qc_verbose : bool
If True, print a full QC report upon failure.
Notes
Extra parameters are not penalized if alpha is given as a scalar. An example is the shape parameter in NegativeBinomial nb1 and nb2.
Optional arguments for the solvers (available in Results.mle_settings):
'l1'
- acc : float (default 1e-6). Requested accuracy as used by slsqp.
'l1_cvxopt_cp'
- abstol : float. Absolute accuracy (default: 1e-7).
- reltol : float. Relative accuracy (default: 1e-6).
- feastol : float. Tolerance for feasibility conditions (default: 1e-7).
- refinement : int. Number of iterative refinement steps when solving KKT equations (default: 1).
Optimization methodology
With \(L\) the negative log likelihood, we solve the convex but non-smooth problem
\[\min_\beta L(\beta) + \sum_k\alpha_k |\beta_k|\]
via the transformation to the smooth, convex, constrained problem in twice as many variables (introducing the auxiliary variables \(u_k\))
\[\min_{\beta,u} L(\beta) + \sum_k\alpha_k u_k,\]
subject to
\[-u_k \leq \beta_k \leq u_k.\]
With \(\partial_k L\) the derivative of \(L\) in the \(k^{th}\) parameter direction, theory dictates that, at the minimum, exactly one of two conditions holds:
(i) \(|\partial_k L| = \alpha_k\) and \(\beta_k \neq 0\)
(ii) \(|\partial_k L| \leq \alpha_k\) and \(\beta_k = 0\)
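The two conditions can be checked by hand on a toy problem. The sketch below is not the logistic likelihood used by BinaryModel: it assumes a one-parameter quadratic stand-in, \(L(\beta) = \tfrac{1}{2}(\beta - b)^2\), whose l1-penalized minimizer is known in closed form (soft thresholding). The function names and test values are hypothetical.

```python
def soft_threshold(b, alpha):
    """Minimizer of 0.5 * (beta - b)**2 + alpha * abs(beta)."""
    if b > alpha:
        return b - alpha
    if b < -alpha:
        return b + alpha
    return 0.0


def check_conditions(b, alpha, tol=1e-12):
    """Return (beta, first_holds, second_holds) at the penalized minimum."""
    beta = soft_threshold(b, alpha)
    grad = beta - b  # dL/dbeta for the quadratic stand-in L
    # First condition: |dL| equals alpha and the coefficient is nonzero.
    first_holds = abs(abs(grad) - alpha) <= tol and beta != 0.0
    # Second condition: |dL| is at most alpha and the coefficient is zero.
    second_holds = abs(grad) <= alpha + tol and beta == 0.0
    return beta, first_holds, second_holds


for b in (3.0, -2.5, 0.4, -1.0):
    beta, first_holds, second_holds = check_conditions(b, alpha=1.0)
    print(f"b={b:+.1f}  beta={beta:+.1f}  "
          f"first={first_holds}  second={second_holds}")
```

For every value of b exactly one of the two flags is True: values with \(|b| > \alpha\) keep a nonzero coefficient and satisfy the first condition, while values with \(|b| \leq \alpha\) are set to zero and satisfy the second.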