statsmodels.stats.anova.AnovaRM
- class statsmodels.stats.anova.AnovaRM(data, depvar, subject, within=None, between=None, aggregate_func=None)
Repeated measures Anova using least squares regression
The within-subject effect sum of squares is calculated by comparing the residual sum of squares of the full regression model with that of the reduced model [1].
Currently, only fully balanced within-subject designs are supported. Calculation of between-subject effects and corrections for violation of sphericity are not yet implemented.
- Parameters:
- data
DataFrame
- depvar
str
The dependent variable in data
- subject
str
The name of the column in data that holds the subject id
- within
list[str]
The within-subject factors
- between
list[str]
The between-subject factors. This is not yet implemented.
- aggregate_func
{None, ‘mean’, callable}
If the data set contains more than a single observation per subject and cell of the specified model, this function will be used to aggregate the data before running the Anova. None (the default) will not perform any aggregation; ‘mean’ is a shortcut to numpy.mean. An exception will be raised if aggregation is required, but no aggregation function was specified.
- Returns:
- results
AnovaResults instance
- Raises:
ValueError
If the data need to be aggregated, but aggregate_func was not specified.
Notes
This implementation currently only supports fully balanced designs. If the data contain more than one observation per subject and cell of the design, these observations need to be aggregated into a single observation before the Anova is calculated, either manually or by passing an aggregation function via the aggregate_func keyword argument. Note that if the input data set was not balanced before performing the aggregation, the implied heteroscedasticity of the data is ignored.
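The two aggregation routes described above can be sketched as follows. This is a minimal illustration, not part of the library documentation: the DataFrame `raw`, its column names (`subject`, `condition`, `score`) and all values are hypothetical, and the sketch assumes that the fitted results expose the table as the `anova_table` attribute with an `F Value` column.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: 3 subjects x 2 conditions,
# with two observations in every subject/condition cell.
raw = pd.DataFrame({
    "subject":   [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "condition": ["a", "a", "b", "b"] * 3,
    "score":     [10.0, 11.0, 14.0, 15.5, 9.0, 9.5,
                  13.0, 12.0, 11.0, 12.5, 14.5, 17.0],
})

# Without aggregate_func, more than one observation per cell is rejected.
try:
    AnovaRM(raw, depvar="score", subject="subject",
            within=["condition"]).fit()
    raised = False
except ValueError:
    raised = True

# Route 1: let AnovaRM aggregate each cell with the 'mean' shortcut.
auto = AnovaRM(raw, depvar="score", subject="subject",
               within=["condition"], aggregate_func="mean").fit()

# Route 2: aggregate manually to one row per subject and cell first.
cell_means = raw.groupby(["subject", "condition"], as_index=False)["score"].mean()
manual = AnovaRM(cell_means, depvar="score", subject="subject",
                 within=["condition"]).fit()

# Both routes should give the same F statistic for the within factor.
f_auto = auto.anova_table.loc["condition", "F Value"]
f_manual = manual.anova_table.loc["condition", "F Value"]
print(raised, f_auto, f_manual)
```

Either route leaves exactly one observation per subject and cell, which is what the balanced-design requirement asks for.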
References
Methods
fit()
Estimate the model and compute the Anova table
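A short usage sketch for a one-way, fully balanced within-subject design is given below. The data set, its column names and its values are invented for illustration. The sketch assumes that the returned AnovaResults instance exposes the table as `anova_table`, with the columns `F Value`, `Num DF`, `Den DF` and `Pr > F`, indexed by the within-factor name.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical balanced data: 4 subjects x 3 conditions, one row per cell.
data = pd.DataFrame({
    "subject":   [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "condition": ["a", "b", "c"] * 4,
    "score":     [10.0, 12.0, 15.0, 9.0, 13.0, 14.0,
                  11.0, 12.5, 16.0, 10.5, 11.0, 15.5],
})

# Build the model, then call fit() to obtain the results object.
model = AnovaRM(data, depvar="score", subject="subject", within=["condition"])
results = model.fit()

# The Anova table is a DataFrame with one row per within-subject effect.
table = results.anova_table
print(table)
```

With 4 subjects and 3 levels of the within factor, the effect has 2 numerator and (4 - 1) * (3 - 1) = 6 denominator degrees of freedom.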