statsmodels.datasets.get_rdataset
statsmodels.datasets.get_rdataset(dataname, package='datasets', cache=False)
Download and return an R dataset.
Parameters: dataname : str
The name of the dataset you want to download
package : str
The package in which the dataset is found. The default is the core ‘datasets’ package.
cache : bool or str
If True, the data is downloaded and cached in the STATSMODELS_DATA folder, which defaults to a folder called statsmodels_data in the user's home folder. If a string, it is taken as the path to a folder to use for caching the data. If False, the data is not cached.
Returns: dataset : Dataset instance
A statsmodels.datasets.utils.Dataset instance. This object has the following attributes:
* data - A pandas DataFrame containing the data
* title - The dataset title
* package - The package from which the data came
* from_cache - Whether cached data was retrieved
* __doc__ - The verbatim R documentation
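As a rough illustration of the attributes listed above, the following sketch loads a dataset and collects a few of its fields. The helper names `summarize` and `load_and_summarize` are hypothetical and not part of statsmodels. The download step needs network access unless the data is already cached.

```python
def summarize(dataset):
    """Collect a few attributes of a Dataset-like object into a dict.

    Hypothetical helper for illustration; not part of statsmodels.
    """
    return {
        "title": dataset.title,
        "package": dataset.package,
        "from_cache": dataset.from_cache,
        "shape": dataset.data.shape,  # data is a pandas DataFrame
    }


def load_and_summarize(dataname, package="datasets", cache=False):
    """Download an R dataset and summarize it (requires network access)."""
    # Imported here so that defining the helper does not require statsmodels.
    from statsmodels.datasets import get_rdataset

    dataset = get_rdataset(dataname, package=package, cache=cache)
    return summarize(dataset)
```

For example, `load_and_summarize("iris")` would fetch the iris data from the default package (the dataset name is only an example), and passing `cache=True` would keep a copy in the STATSMODELS_DATA folder for later calls.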
Notes
If the R dataset has an integer index, it is reset to be zero-based. Otherwise the index is preserved. The caching facilities are dumb: no download dates, e-tags, or other identifying information are checked to decide whether the data should be downloaded again. If the dataset is in the cache, it is used.
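The caching behavior described in these notes can be pictured with a minimal sketch. This is not the statsmodels implementation: `fetch_with_dumb_cache`, the `download` callable, and the file naming are assumptions made for illustration. The only point is that a cached file is used whenever it exists, with no freshness check.

```python
import os
import tempfile


def fetch_with_dumb_cache(name, download, cache_dir):
    """Return (data, from_cache) for `name`.

    Hypothetical illustration of a cache without freshness checks:
    if a file for `name` exists in `cache_dir`, it is returned as-is.
    """
    path = os.path.join(cache_dir, name + ".csv")  # file naming is an assumption
    if os.path.exists(path):
        # No dates or e-tags are consulted; the cached copy always wins.
        with open(path, "r", encoding="utf-8") as fh:
            return fh.read(), True
    data = download(name)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(data)
    return data, False


# Simulate a remote source whose content changes on every download.
calls = []


def fake_download(name):
    calls.append(name)
    return "version {}".format(len(calls))


with tempfile.TemporaryDirectory() as tmp:
    first = fetch_with_dumb_cache("iris", fake_download, tmp)
    second = fetch_with_dumb_cache("iris", fake_download, tmp)
```

In this sketch the second call returns the copy stored by the first call and never invokes the download, even though the simulated remote source would have returned different content.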