statsmodels.sandbox.regression.try_ols_anova.form2design

statsmodels.sandbox.regression.try_ols_anova.form2design(ss, data)

convert a string formula to a dictionary of variables

Parameters:
ss : str

formula string of space-separated terms, each in one of the following forms:
  • I : add a constant, stored under the name const

  • varname : a plain variable name; the data is used as is

  • F:varname : create dummy variables for the factor varname

  • P:varname1*varname2 : create product dummy variables for the two factors varname1 and varname2

  • G:varname1*varname2 : create the product between the factor varname1 and the continuous variable varname2

data : dict or structured array

data set; variables are accessed by name, as in a dictionary

Returns:
vars : dictionary

dictionary mapping each name to its variable, with factor terms converted to dummy variables

names : list

list of names in the order the terms appear in the formula; product (P:) and grouped continuous (G:) terms are named by joining the individual variable names, for example P:c*d is named cd
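
To illustrate what "converted dummy variables" can look like, here is a minimal pure-Python sketch of indicator columns for a factor and of a factor-by-continuous product. The helpers dummy_columns and group_continuous are hypothetical names introduced for this illustration; they are not the statsmodels functions, which work on arrays and may differ in detail (for example, by dropping one level of the factor).

```python
def dummy_columns(labels):
    """Return (levels, rows) with one 0/1 indicator per distinct label.

    Hypothetical helper for illustration only; not part of statsmodels.
    """
    levels = sorted(set(labels))
    rows = [[1 if label == level else 0 for level in levels]
            for label in labels]
    return levels, rows


def group_continuous(labels, values):
    """Multiply each indicator column by a continuous variable.

    Hypothetical helper; mirrors the idea behind a G: term.
    """
    levels, rows = dummy_columns(labels)
    return levels, [[d * v for d in row] for row, v in zip(rows, values)]


b = ["x", "y", "x", "z"]   # a factor with three levels
f = [1.5, 2.0, 0.5, 3.0]   # a continuous variable

levels, dummies = dummy_columns(b)
_, grouped = group_continuous(b, f)
```

Each row of dummies has a single 1 in the column of its level, and each row of grouped carries the continuous value in that same column.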

Notes

with an ordered dictionary (one that preserves insertion order), the separate list of names would not be necessary
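
As a sketch of this note: since Python 3.7 the built-in dict preserves insertion order, so the key order alone can serve as the name list. The None values below are placeholders for illustration, not output of form2design.

```python
# Placeholder values stand in for the arrays a design dictionary would hold.
design = {}
for name in ["const", "a", "b", "cd", "cf"]:
    design[name] = None  # keys are inserted in formula order

# Iterating over a dict yields keys in insertion order (Python 3.7+).
names = list(design)
print(names)
```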

Examples

>>> xx, n = form2design('I a F:b P:c*d G:c*f', testdata)
>>> sorted(xx.keys())
['a', 'b', 'cd', 'cf', 'const']
>>> n
['const', 'a', 'b', 'cd', 'cf']
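
To see how the formula string in the example breaks into terms, the following sketch classifies each space-separated term by its prefix and derives the corresponding name. It is an illustration under the term grammar described above, not the library's implementation; classify_term and term_name are hypothetical helpers.

```python
def classify_term(term):
    """Return (kind, variable_names) for one formula term.

    Hypothetical helper for illustration only; not part of statsmodels.
    """
    if term == "I":
        return "constant", []
    if ":" not in term:
        return "plain", [term]
    prefix, _, rest = term.partition(":")
    kinds = {"F": "factor", "P": "product", "G": "group_continuous"}
    if prefix not in kinds:
        raise ValueError("unknown expression in formula: %r" % term)
    return kinds[prefix], rest.split("*")


def term_name(kind, variables):
    """Name of a term: 'const' for I, otherwise the joined variable names."""
    return "const" if kind == "constant" else "".join(variables)


formula = "I a F:b P:c*d G:c*f"
terms = [classify_term(t) for t in formula.split()]
names = [term_name(kind, variables) for kind, variables in terms]
```

The resulting names follow the order of the terms in the formula, which matches the list n shown in the example.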