statsmodels.stats.dist_dependence_measures.distance_correlation
- statsmodels.stats.dist_dependence_measures.distance_correlation(x, y)
Distance correlation.
Calculate the empirical distance correlation as described in [1]. This statistic is analogous to product-moment correlation and describes the dependence between x and y, which are random vectors of arbitrary, not necessarily equal, dimension. The statistic's values range between 0 (implies independence) and 1 (implies complete dependence).
- Parameters:
- x : array_like, 1-D or 2-D
If x is 1-D, then it is assumed to be a vector of observations of a single random variable. If x is 2-D, then the rows are observations and the columns are the components of a random vector, i.e., each column represents a different component of the random vector x.
- y : array_like, 1-D or 2-D
Same as x, except that only the number of observations has to match that of x. If y is 2-D, the number of columns of y (i.e., the number of components of the random vector) does not need to match the number of columns of x.
- Returns:
float
The empirical distance correlation between x and y.
References
[1] Szekely, G.J., Rizzo, M.L., and Bakirov, N.K. (2007) “Measuring and testing dependence by correlation of distances”. Annals of Statistics, Vol. 35, No. 6, pp. 2769-2794.
Examples
>>> import numpy as np
>>> from statsmodels.stats.dist_dependence_measures import (
...     distance_correlation)
>>> distance_correlation(np.random.random(1000),
...                      np.random.random(1000))
0.04060497840149489  # random
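To make the statistic more concrete, the following is a minimal NumPy sketch of the empirical distance correlation as it is commonly defined in [1], using double-centered pairwise Euclidean distance matrices. The names `dcor_sketch` and `_centered_distances` are hypothetical helpers introduced only for illustration. This is not the statsmodels implementation, and its output is not guaranteed to match `distance_correlation` to the last digit.

```python
import numpy as np


def _centered_distances(a):
    """Double-centered Euclidean distance matrix of the observations in a."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # 1-D input: n observations of a single variable
    # Pairwise Euclidean distances between rows (observations).
    diff = a[:, None, :] - a[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract row and column means, then add back the grand mean.
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def dcor_sketch(x, y):
    """Illustrative empirical distance correlation (hypothetical helper)."""
    A = _centered_distances(x)
    B = _centered_distances(y)
    if A.shape != B.shape:
        raise ValueError("x and y must have the same number of observations")
    dcov2 = (A * B).mean()    # squared sample distance covariance
    dvar2_x = (A * A).mean()  # squared sample distance variance of x
    dvar2_y = (B * B).mean()  # squared sample distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0            # a constant sample carries no dependence signal
    # Clamp tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(dcov2, 0.0) / denom))


rng = np.random.default_rng(0)
x = rng.normal(size=200)
print(dcor_sketch(x, x))       # identical inputs give 1.0 up to rounding
print(dcor_sketch(x, x ** 2))  # nonlinear dependence, clearly above 0

# 2-D inputs: rows are observations; column counts may differ.
X = rng.normal(size=(200, 3))
Y = rng.normal(size=(200, 2))
print(dcor_sketch(X, Y))
```

The sketch builds full n-by-n distance matrices, so memory grows quadratically with the number of observations; it is meant to illustrate the definition rather than to be efficient.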