statsmodels.discrete.discrete_model.Probit.fit_regularized

method

Probit.fit_regularized(start_params=None, method='l1', maxiter='defined_by_method', full_output=1, disp=1, callback=None, alpha=0, trim_mode='auto', auto_trim_tol=0.01, size_trim_tol=0.0001, qc_tol=0.03, **kwargs)

Fit the model using regularized maximum likelihood. Both the regularization method and the solver are determined by the method argument.

Parameters
start_params : array-like, optional

Initial guess of the solution for the loglikelihood maximization. The default is an array of zeros.

method : ‘l1’ or ‘l1_cvxopt_cp’

See notes for details.

maxiter : int or ‘defined_by_method’

Maximum number of iterations to perform. If ‘defined_by_method’, then use method defaults (see notes).

full_output : bool

Set to True to have all available output in the Results object’s mle_retvals attribute. The output is dependent on the solver. See LikelihoodModelResults notes section for more information.

disp : bool

Set to True to print convergence messages.

fargs : tuple

Extra arguments passed to the likelihood function, i.e., loglike(x, *args).

callback : callable callback(xk)

Called after each iteration, as callback(xk), where xk is the current parameter vector.

retall : bool

Set to True to return the list of solutions at each iteration. Available in the Results object’s mle_retvals attribute.

alpha : non-negative scalar or numpy array (same size as parameters)

The weight multiplying the l1 penalty term.

trim_mode : ‘auto’, ‘size’, or ‘off’

If not ‘off’, trim (set to zero) parameters that would have been zero if the solver had reached the theoretical minimum. If ‘auto’, trim params using the optimality conditions given in the notes. If ‘size’, trim params whose absolute value is very small.

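size_trim_tol : float or ‘auto’ (default = 0.0001, per the signature above)

For use when trim_mode == ‘size’. Parameters whose absolute value falls below this tolerance are set to zero.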

auto_trim_tol : float

For use when trim_mode == ‘auto’.

qc_tol : float

Print a warning and don’t allow auto trim when condition 2 in the notes is violated by this much.

qc_verbose : bool

If True, print out a full QC report upon failure.
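The trim_mode == ‘size’ rule described above amounts to thresholding the fitted parameters. A minimal sketch, assuming a strict comparison against the tolerance; trim_by_size and the sample values are hypothetical names and numbers for illustration, not part of the statsmodels API:

```python
import numpy as np

def trim_by_size(params, size_trim_tol=1e-4):
    """Set entries whose absolute value is below size_trim_tol to exactly zero.

    Hypothetical helper illustrating the 'size' trimming rule; the strict
    inequality is an assumption of this sketch.
    """
    params = np.asarray(params, dtype=float)
    return np.where(np.abs(params) < size_trim_tol, 0.0, params)

# Made-up solver output: two entries are numerically tiny but not exactly zero.
fitted = np.array([0.75, 3e-6, -1.2, -8e-5])
trimmed = trim_by_size(fitted)
print(trimmed)
```

Trimming of this kind matters because an iterative solver rarely returns exact zeros, even for parameters the L1 penalty would drive to zero at the true minimum.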

Notes

Extra parameters are not penalized if alpha is given as a scalar. An example is the shape parameter in NegativeBinomial nb1 and nb2.

Optional arguments for the solvers (available in Results.mle_settings):

'l1'
    acc : float (default 1e-6)
        Requested accuracy as used by slsqp
'l1_cvxopt_cp'
    abstol : float
        absolute accuracy (default: 1e-7).
    reltol : float
        relative accuracy (default: 1e-6).
    feastol : float
        tolerance for feasibility conditions (default: 1e-7).
    refinement : int
        number of iterative refinement steps when solving KKT
        equations (default: 1).

Optimization methodology

With \(L\) the negative log likelihood, we solve the convex but non-smooth problem

\[\min_\beta L(\beta) + \sum_k\alpha_k |\beta_k|\]

via the transformation to the smooth, convex, constrained problem in twice as many variables (introducing the auxiliary variables \(u_k\))

\[\min_{\beta,u} L(\beta) + \sum_k\alpha_k u_k,\]

subject to

\[-u_k \leq \beta_k \leq u_k.\]
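The reformulation above can be tried on a toy problem. The sketch below is not the statsmodels implementation: it assumes a simple quadratic stand-in \(L(\beta) = \tfrac12\|\beta - c\|^2\) in place of the probit negative log likelihood, chosen because its L1-penalized minimizer is known in closed form (soft-thresholding), and it calls SciPy's SLSQP solver directly. The vector c, the weights, the starting point, and the solver options are all made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize

c = np.array([3.0, -0.5, 1.2])      # defines the stand-in smooth loss (assumed)
alpha = np.array([1.0, 1.0, 1.0])   # penalty weights alpha_k (assumed)
k = c.size

def objective(x):
    # x stacks the original variables beta and the auxiliary variables u.
    beta, u = x[:k], x[k:]
    return 0.5 * np.sum((beta - c) ** 2) + np.sum(alpha * u)

def gradient(x):
    beta = x[:k]
    return np.concatenate([beta - c, alpha])

# -u_k <= beta_k <= u_k, written as two sets of "g(x) >= 0" constraints.
constraints = [
    {"type": "ineq", "fun": lambda x: x[k:] - x[:k]},  # u_k - beta_k >= 0
    {"type": "ineq", "fun": lambda x: x[k:] + x[:k]},  # u_k + beta_k >= 0
]

x0 = np.concatenate([np.zeros(k), np.ones(k)])  # strictly feasible start
result = minimize(objective, x0, jac=gradient, method="SLSQP",
                  constraints=constraints,
                  options={"ftol": 1e-10, "maxiter": 200})
beta_hat = result.x[:k]

# Closed-form minimizer of the non-smooth problem for this particular loss.
beta_exact = np.sign(c) * np.maximum(np.abs(c) - alpha, 0.0)
print(beta_hat)
print(beta_exact)
```

At the solution each \(u_k\) equals \(|\beta_k|\), so the smooth objective in \((\beta, u)\) takes the same value as the original non-smooth one.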

With \(\partial_k L\) the derivative of \(L\) in the \(k^{th}\) parameter direction, theory dictates that, at the minimum, exactly one of two conditions holds:

  1. \(|\partial_k L| = \alpha_k\) and \(\beta_k \neq 0\)

  2. \(|\partial_k L| \leq \alpha_k\) and \(\beta_k = 0\)
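The two conditions can be checked numerically for a candidate solution. The sketch below again assumes the quadratic stand-in loss \(L(\beta) = \tfrac12\|\beta - c\|^2\), for which \(\partial_k L = \beta_k - c_k\) and the exact minimizer is given by soft-thresholding. The function classify, its tolerance, and the numbers are hypothetical and are not how statsmodels performs its own QC check.

```python
import numpy as np

c = np.array([3.0, -0.5, 1.2])      # defines the stand-in loss (assumed)
alpha = np.array([1.0, 1.0, 1.0])   # penalty weights alpha_k (assumed)

# Exact minimizer of 0.5*||beta - c||^2 + sum_k alpha_k*|beta_k|.
beta = np.sign(c) * np.maximum(np.abs(c) - alpha, 0.0)
grad = beta - c                     # partial_k L for the stand-in loss

def classify(beta, grad, alpha, tol=1e-8):
    """Return 1 or 2 for the condition that holds per parameter, 0 if neither."""
    labels = []
    for b, g, a in zip(beta, grad, alpha):
        if b != 0 and np.isclose(abs(g), a, atol=tol):
            labels.append(1)        # |dL/dbeta_k| = alpha_k and beta_k != 0
        elif b == 0 and abs(g) <= a + tol:
            labels.append(2)        # |dL/dbeta_k| <= alpha_k and beta_k = 0
        else:
            labels.append(0)        # neither holds: not a minimum
    return labels

labels = classify(beta, grad, alpha)
print(labels)
```

A label of 0 for any parameter indicates the candidate is not a minimizer of the penalized problem; for example, the unpenalized minimizer beta = c has zero gradient and nonzero entries, so it satisfies neither condition.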