statsmodels.base.distributed_estimation.DistributedModel

class statsmodels.base.distributed_estimation.DistributedModel(partitions, model_class=None, init_kwds=None, estimation_method=None, estimation_kwds=None, join_method=None, join_kwds=None, results_class=None, results_kwds=None)[source]

Distributed model class

Parameters:
partitions : scalar

The number of partitions that the data will be split into.

model_class : statsmodels model class

The model class which will be used for estimation. If None, this defaults to OLS.

init_kwds : dict-like or None

Keywords needed for initializing the model, in addition to endog and exog.

init_kwds_generator : generator or None

Additional keyword generator that produces model init_kwds that may vary based on data partition. The current use case is for WLS and GLS.

estimation_method : function or None

The method that performs the estimation for each partition. If None, this defaults to _est_regularized_debiased.

estimation_kwds : dict-like or None

Keywords to be passed to estimation_method.

join_method : function or None

The method used to recombine the results from each partition. If None, this defaults to _join_debiased.

join_kwds : dict-like or None

Keywords to be passed to join_method.

results_class : results class or None

The class of results that should be returned. If None, this defaults to RegularizedResults.

results_kwds : dict-like or None

Keywords to be passed to results class.
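
The estimation_method and join_method parameters can be illustrated with a minimal sketch. It assumes that the estimation method is called as estimation_method(model, pnum, partitions, fit_kwds=..., **estimation_kwds) and that the join method receives the list of per-partition return values. The helper names est_plain, join_mean and chunks are hypothetical and not part of statsmodels.

```python
import numpy as np
from statsmodels.base.distributed_estimation import DistributedModel
from statsmodels.regression.linear_model import OLS

rng = np.random.default_rng(1)
n_obs, n_part = 400, 4
beta = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n_obs, beta.size))
y = X @ beta + rng.normal(size=n_obs)


def chunks():
    """Return an iterable of (endog, exog) tuples, one per partition."""
    return zip(np.array_split(y, n_part), np.array_split(X, n_part))


def est_plain(mod, pnum, partitions, fit_kwds=None):
    """Hypothetical estimation method: unregularized fit on one partition."""
    fit_kwds = {} if fit_kwds is None else fit_kwds
    return mod.fit(**fit_kwds).params


def join_mean(params_l):
    """Hypothetical join method: average the per-partition estimates."""
    return np.mean(params_l, axis=0)


mod = DistributedModel(
    n_part,
    model_class=OLS,
    estimation_method=est_plain,
    join_method=join_mean,
)
res = mod.fit(chunks())
print(res.params)
```

Because results_class is left at None here, the averaged parameters are wrapped in the default results class.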

partitions

See Parameters.

Type:

scalar

model_class

See Parameters.

Type:

statsmodels model class

init_kwds

See Parameters.

Type:

dict-like

init_kwds_generator

See Parameters.

Type:

generator or None

estimation_method

See Parameters.

Type:

function

estimation_kwds

See Parameters.

Type:

dict-like

join_method

See Parameters.

Type:

function

join_kwds

See Parameters.

Type:

dict-like

results_class

See Parameters.

Type:

results class

results_kwds

See Parameters.

Type:

dict-like

Notes
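
Per-partition init keywords, such as WLS weights, can be supplied through init_kwds_generator. The sketch below assumes that fit accepts init_kwds_generator as a keyword argument (the constructor signature above does not list it) and that each generated dict is merged into init_kwds for the matching partition. The helpers data_chunks, weight_chunks, est_wls and join_mean are hypothetical.

```python
import numpy as np
from statsmodels.base.distributed_estimation import DistributedModel
from statsmodels.regression.linear_model import WLS

rng = np.random.default_rng(2)
n_obs, n_part = 300, 3
beta = np.array([1.0, -1.0, 0.5])
X = rng.normal(size=(n_obs, beta.size))
sigma = rng.uniform(0.5, 2.0, size=n_obs)  # heteroscedastic noise scale
y = X @ beta + sigma * rng.normal(size=n_obs)
weights = 1.0 / sigma**2


def data_chunks():
    """Return an iterable of (endog, exog) tuples, one per partition."""
    return zip(np.array_split(y, n_part), np.array_split(X, n_part))


def weight_chunks():
    """Yield one init_kwds dict per partition, in the same order as the data."""
    for w in np.array_split(weights, n_part):
        yield {"weights": w}


def est_wls(mod, pnum, partitions, fit_kwds=None):
    """Hypothetical estimation method: plain WLS fit on one partition."""
    return mod.fit().params


def join_mean(params_l):
    """Hypothetical join method: average the per-partition estimates."""
    return np.mean(params_l, axis=0)


mod = DistributedModel(
    n_part,
    model_class=WLS,
    estimation_method=est_wls,
    join_method=join_mean,
)
res = mod.fit(data_chunks(), init_kwds_generator=weight_chunks())
print(res.params)
```

The data generator and the keyword generator must yield the same number of items in the same order, since they are consumed together.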

Examples
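
A minimal sketch of the default configuration (OLS with debiased regularized estimation) on simulated data. The helper partition_data is hypothetical, and the sketch assumes that the default estimation method requires an "alpha" entry in fit_kwds.

```python
import numpy as np
from statsmodels.base.distributed_estimation import DistributedModel

rng = np.random.default_rng(0)
n_obs, n_part = 400, 4
beta = np.array([1.5, 0.0, -2.0, 0.0, 0.5, 0.0])
X = rng.normal(size=(n_obs, beta.size))
y = X @ beta + rng.normal(size=n_obs)


def partition_data(endog, exog, partitions):
    """Yield (endog, exog) tuples, one per partition."""
    yield from zip(
        np.array_split(endog, partitions),
        np.array_split(exog, partitions),
    )


# All optional arguments are left at None, so the documented defaults apply.
mod = DistributedModel(n_part)
res = mod.fit(partition_data(y, X, n_part), fit_kwds={"alpha": 0.2})
print(res.params)
```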

Methods

fit(data_generator[, fit_kwds, ...])

Performs the distributed estimation using the corresponding DistributedModel

fit_joblib(data_generator, fit_kwds, ...[, ...])

Performs the distributed estimation in parallel using joblib

fit_sequential(data_generator, fit_kwds[, ...])

Sequentially performs the distributed estimation using the corresponding DistributedModel
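
The relationship between fit and fit_sequential can be sketched as follows. The sketch assumes that fit_sequential returns the list of per-partition outputs of estimation_method, and that fit then passes this list to join_method together with join_kwds. The helper chunks is hypothetical.

```python
import numpy as np
from statsmodels.base.distributed_estimation import DistributedModel

rng = np.random.default_rng(3)
n_obs, n_part = 300, 3
beta = np.array([2.0, 0.0, -1.0, 0.0, 0.5, 0.0])
X = rng.normal(size=(n_obs, beta.size))
y = X @ beta + rng.normal(size=n_obs)


def chunks():
    """Return a fresh iterable of (endog, exog) tuples, one per partition."""
    return zip(np.array_split(y, n_part), np.array_split(X, n_part))


mod = DistributedModel(n_part)
fit_kwds = {"alpha": 0.1}

# Step 1: estimate on each partition in turn.
per_partition = mod.fit_sequential(chunks(), fit_kwds)

# Step 2: recombine the partition results with the configured join method.
manual_params = mod.join_method(per_partition, **mod.join_kwds)

# The one-call interface should give the same parameters.
res = mod.fit(chunks(), fit_kwds=fit_kwds)
print(np.allclose(manual_params, res.params))
```

The data iterable is rebuilt for each call because a zip object is exhausted after one pass.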