statsmodels.datasets.get_rdataset

statsmodels.datasets.get_rdataset(dataname, package='datasets', cache=False)[source]

download and return R dataset

Parameters:

dataname : str

The name of the dataset you want to download

package : str

The package in which the dataset is found. The default is the core ‘datasets’ package.

cache : bool or str

If True, will download this data into the STATSMODELS_DATA folder. The default location is a folder called statsmodels_data in the user home folder. Otherwise, you can specify a path to a folder to use for caching the data. If False, the data will not be cached.

Returns:

dataset : Dataset instance

A statsmodels.data.utils.Dataset instance. This objects has attributes:

* data - A pandas DataFrame containing the data
* title - The dataset title
* package - The package from which the data came
* from_cache - Whether not cached data was retrieved
* __doc__ - The verbatim R documentation.

Notes

If the R dataset has an integer index. This is reset to be zero-based. Otherwise the index is preserved. The caching facilities are dumb. That is, no download dates, e-tags, or otherwise identifying information is checked to see if the data should be downloaded again or not. If the dataset is in the cache, it’s used.