Tools

Our tool collection contains some convenience functions for users and functions that were written mainly for internal use.

In addition to this tools directory, several other subpackages have their own tools modules, for example statsmodels.tsa.tsatools.

Module Reference

Basic tools (tools)

These are basic and miscellaneous tools. The full import path is statsmodels.tools.tools.

tools.add_constant(data[, prepend, has_constant]) Adds a column of ones to an array
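A column of ones is what allows a regression on the array to estimate an intercept. The following is a minimal pure-Python sketch of that idea only, not the statsmodels implementation. The helper name add_ones_column is hypothetical, and the has_constant option is not modeled.

```python
def add_ones_column(rows, prepend=True):
    """Hypothetical helper: add a column of ones to a list of rows.

    With prepend=True the ones become the first column, otherwise the last.
    """
    if prepend:
        return [[1.0] + list(row) for row in rows]
    return [list(row) + [1.0] for row in rows]


data = [[2.0, 3.0], [4.0, 5.0]]
with_const_first = add_ones_column(data)
with_const_last = add_ones_column(data, prepend=False)
print(with_const_first)
print(with_const_last)
```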

The next group consists mostly of helper functions that are either not separately tested or insufficiently tested.

tools.categorical(data[, col, dictnames, drop]) Returns a dummy matrix given an array of categorical variables.
tools.clean0(matrix) Erase columns of zeros: can save some time in pseudoinverse.
tools.fullrank(X[, r]) Return a matrix whose column span is the same as X.
tools.isestimable(C, D) True if (Q, P) contrast C is estimable for (N, P) design D
tools.recipr(x) Return the reciprocal of an array, setting all entries less than or equal to 0 to 0.
tools.recipr0(x) Return the reciprocal of an array, setting all entries equal to 0 to 0.
tools.unsqueeze(data, axis, oldshape) Unsqueeze a collapsed array
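The difference between recipr and recipr0 is easy to miss from the one-line descriptions: the first also zeroes out negative entries, the second only protects against division by zero. The sketch below restates those two descriptions in plain Python on lists. The *_sketch names are hypothetical and this is not the library code.

```python
def recipr_sketch(values):
    """Reciprocal of each entry; entries less than or equal to 0 map to 0."""
    return [1.0 / v if v > 0 else 0.0 for v in values]


def recipr0_sketch(values):
    """Reciprocal of each entry; only entries equal to 0 map to 0."""
    return [1.0 / v if v != 0 else 0.0 for v in values]


sample = [-2.0, 0.0, 4.0]
positive_only = recipr_sketch(sample)   # negative and zero entries become 0
nonzero_only = recipr0_sketch(sample)   # only the zero entry becomes 0
print(positive_only)
print(nonzero_only)
```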

Numerical Differentiation

numdiff.approx_fprime(x, f[, epsilon, args, …]) Gradient of function, or Jacobian if function f returns 1d array
numdiff.approx_fprime_cs(x, f[, epsilon, …]) Calculate gradient or Jacobian with complex step derivative approximation
numdiff.approx_hess1(x, f[, epsilon, args, …]) Calculate Hessian with finite difference derivative approximation
numdiff.approx_hess2(x, f[, epsilon, args, …]) Calculate Hessian with finite difference derivative approximation
numdiff.approx_hess3(x, f[, epsilon, args, …]) Calculate Hessian with finite difference derivative approximation
numdiff.approx_hess_cs(x, f[, epsilon, …]) Calculate Hessian with complex-step derivative approximation
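To illustrate how a finite-difference gradient differs from a complex-step gradient, here is a minimal pure-Python sketch. The helper names forward_diff_gradient and complex_step_gradient, the example function, and the step sizes are all assumptions made for illustration; they are not part of numdiff.

```python
def f(x):
    """Example function x0^2 + x0 * x1^3; accepts float or complex entries."""
    return x[0] * x[0] + x[0] * x[1] * x[1] * x[1]


def forward_diff_gradient(x, func, h=1e-6):
    """Forward difference: (f(x + h*e_i) - f(x)) / h for each coordinate i."""
    f0 = func(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h
        grad.append((func(shifted) - f0) / h)
    return grad


def complex_step_gradient(x, func, h=1e-20):
    """Complex step: Im(f(x + 1j*h*e_i)) / h for each coordinate i."""
    grad = []
    for i in range(len(x)):
        shifted = [complex(v) for v in x]
        shifted[i] += 1j * h
        grad.append(func(shifted).imag / h)
    return grad


point = [1.5, 2.0]
# Analytic gradient of f, used only to compare the two approximations.
exact = [2 * point[0] + point[1] ** 3, 3 * point[0] * point[1] ** 2]
fd = forward_diff_gradient(point, f)
cs = complex_step_gradient(point, f)
print("exact:             ", exact)
print("forward difference:", fd)
print("complex step:      ", cs)
```

The complex-step variant involves no subtraction of nearly equal numbers, so it tolerates a much smaller step, but it requires that the function can be evaluated with complex arguments.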

Measures of fit performance (eval_measures)

The first group of functions in this module are standalone versions of the information criteria aic, bic and hqic. The functions with the _sigma suffix take an estimate of the error variance, sigma2, as argument; those without the suffix take the value of the log-likelihood, llf.
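As a rough illustration of the log-likelihood variants, the sketch below writes out the standard textbook definitions of the three criteria in plain Python. The *_sketch names are hypothetical, and it is an assumption here that df_modelwc counts the estimated parameters including the constant; the library functions may differ in details.

```python
import math


def aic_sketch(llf, nobs, df_modelwc):
    """Akaike information criterion: -2*llf + 2*k (nobs is unused here)."""
    return -2.0 * llf + 2.0 * df_modelwc


def bic_sketch(llf, nobs, df_modelwc):
    """Bayesian (Schwarz) criterion: -2*llf + log(nobs)*k."""
    return -2.0 * llf + math.log(nobs) * df_modelwc


def hqic_sketch(llf, nobs, df_modelwc):
    """Hannan-Quinn criterion: -2*llf + 2*log(log(nobs))*k."""
    return -2.0 * llf + 2.0 * math.log(math.log(nobs)) * df_modelwc


# Made-up inputs for illustration: log-likelihood, sample size, parameter count.
llf, nobs, k = -100.0, 50, 3
aic_value = aic_sketch(llf, nobs, k)
bic_value = bic_sketch(llf, nobs, k)
hqic_value = hqic_sketch(llf, nobs, k)
print(aic_value, bic_value, hqic_value)
```

For all three criteria, smaller values indicate a better trade-off between fit and the number of parameters.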

The second group of functions are measures of fit or prediction performance. Most are one-liners intended as helper functions. Each calculates a performance or distance statistic for the difference between two arrays. For example, in a Monte Carlo or cross-validation study, the first array would hold the estimation results for the different replications or draws, while the second array would hold the true or observed values.
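To make the two-array convention concrete, here is a small stdlib-only sketch of a few such statistics. The function names are hypothetical, the data are made up, and the sign convention (first array minus second array) is an assumption rather than a statement about the library.

```python
import math
import statistics


def error_measures_sketch(x1, x2):
    """Summary statistics of the elementwise error x1 - x2."""
    err = [a - b for a, b in zip(x1, x2)]
    abs_err = [abs(e) for e in err]
    mse = statistics.fmean(e * e for e in err)
    return {
        "bias": statistics.fmean(err),            # mean error
        "mse": mse,                               # mean squared error
        "rmse": math.sqrt(mse),                   # root mean squared error
        "meanabs": statistics.fmean(abs_err),     # mean absolute error
        "medianabs": statistics.median(abs_err),  # median absolute error
        "maxabs": max(abs_err),                   # maximum absolute error
    }


estimates = [1.0, 2.0, 4.0]   # e.g. results from three replications
truth = [1.5, 2.0, 3.0]       # true or observed values
measures = error_measures_sketch(estimates, truth)
for name, value in measures.items():
    print(name, value)
```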

eval_measures.aic(llf, nobs, df_modelwc) Akaike information criterion
eval_measures.aic_sigma(sigma2, nobs, df_modelwc) Akaike information criterion
eval_measures.aicc(llf, nobs, df_modelwc) Akaike information criterion (AIC) with small sample correction
eval_measures.aicc_sigma(sigma2, nobs, …) Akaike information criterion (AIC) with small sample correction
eval_measures.bic(llf, nobs, df_modelwc) Bayesian information criterion (BIC) or Schwarz criterion
eval_measures.bic_sigma(sigma2, nobs, df_modelwc) Bayesian information criterion (BIC) or Schwarz criterion
eval_measures.hqic(llf, nobs, df_modelwc) Hannan-Quinn information criterion (HQC)
eval_measures.hqic_sigma(sigma2, nobs, …) Hannan-Quinn information criterion (HQC)
eval_measures.bias(x1, x2[, axis]) bias, mean error
eval_measures.iqr(x1, x2[, axis]) interquartile range of error
eval_measures.maxabs(x1, x2[, axis]) maximum absolute error
eval_measures.meanabs(x1, x2[, axis]) mean absolute error
eval_measures.medianabs(x1, x2[, axis]) median absolute error
eval_measures.medianbias(x1, x2[, axis]) median bias, median error
eval_measures.mse(x1, x2[, axis]) mean squared error
eval_measures.rmse(x1, x2[, axis]) root mean squared error
eval_measures.stde(x1, x2[, ddof, axis]) standard deviation of error
eval_measures.vare(x1, x2[, ddof, axis]) variance of error