statsmodels.stats.dist_dependence_measures.distance_statistics
statsmodels.stats.dist_dependence_measures.distance_statistics(x, y, x_dist=None, y_dist=None)
Calculate various distance dependence statistics.
Calculate several distance dependence statistics as described in [1].
- Parameters
- x : array_like, 1-D or 2-D
If x is 1-D, then it is assumed to be a vector of observations of a single random variable. If x is 2-D, then the rows are observations and the columns are the components of a random vector, i.e., each column represents a different component of the random vector x.
- y : array_like, 1-D or 2-D
Same as x, but only the number of observations has to match that of x. If y is 2-D, the number of columns of y (i.e., the number of components in the random vector) does not need to match the number of columns in x.
- x_dist : array_like, 2-D, optional
A square 2-D array_like object whose values are the Euclidean distances between x’s rows.
- y_dist : array_like, 2-D, optional
A square 2-D array_like object whose values are the Euclidean distances between y’s rows.
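The x_dist and y_dist arguments accept precomputed distance matrices, which can be useful when the same sample is reused across several calls. A minimal NumPy sketch of building such matrices is given here; the helper name euclidean_distance_matrix is hypothetical and not part of statsmodels.

```python
import numpy as np


def euclidean_distance_matrix(v):
    """Return the square matrix of Euclidean distances between the rows of v.

    Hypothetical helper for illustration; not a statsmodels function.
    """
    v = np.asarray(v, dtype=float)
    if v.ndim == 1:
        # A 1-D input is treated as n observations of a single variable.
        v = v.reshape(-1, 1)
    # Pairwise differences between rows, shape (n, n, n_components).
    diff = v[:, np.newaxis, :] - v[np.newaxis, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


rng = np.random.default_rng(0)
x = rng.random((50, 3))  # 50 observations of a 3-component random vector
y = rng.random(50)       # 50 observations of a single random variable

x_dist = euclidean_distance_matrix(x)  # shape (50, 50)
y_dist = euclidean_distance_matrix(y)  # shape (50, 50)
```

Both matrices are square with one row and column per observation, so they could then be supplied as distance_statistics(x, y, x_dist=x_dist, y_dist=y_dist).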
- Returns
collections.namedtuple
A named tuple of distance dependence statistics (DistDependStat) with the following values:
test_statistic : float - The “basic” test statistic (i.e., the one used when the emp method is chosen when calling distance_covariance_test()).
distance_correlation : float - The distance correlation between x and y.
distance_covariance : float - The distance covariance of x and y.
dvar_x : float - The distance variance of x.
dvar_y : float - The distance variance of y.
S : float - The mean of the Euclidean distances in x multiplied by the mean of those in y. Mostly used internally.
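To illustrate how the returned values relate to one another, the sample statistics from [1] can be sketched in plain NumPy using double-centered distance matrices. This is an illustrative sketch, not the statsmodels implementation: the function name distance_stats_sketch is hypothetical, the result is a plain dict rather than a DistDependStat, and constant inputs (zero distance variance) are not handled.

```python
import numpy as np


def _pairwise_euclidean(v):
    """Square matrix of Euclidean distances between the rows of v."""
    v = np.asarray(v, dtype=float)
    if v.ndim == 1:
        v = v[:, None]  # n observations of a single variable
    diff = v[:, None, :] - v[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def _double_center(d):
    """Subtract row and column means, then add back the grand mean."""
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_stats_sketch(x, y):
    """Hypothetical helper: sample distance statistics as a dict."""
    a = _pairwise_euclidean(x)
    b = _pairwise_euclidean(y)
    n = a.shape[0]
    A = _double_center(a)
    B = _double_center(b)

    # Clamp at zero to guard against tiny negative values from rounding.
    dcov = np.sqrt(max((A * B).mean(), 0.0))
    dvar_x = np.sqrt((A * A).mean())
    dvar_y = np.sqrt((B * B).mean())

    return {
        "test_statistic": n * dcov ** 2,
        "distance_correlation": dcov / np.sqrt(dvar_x * dvar_y),
        "distance_covariance": dcov,
        "dvar_x": dvar_x,
        "dvar_y": dvar_y,
        "S": a.mean() * b.mean(),
    }


x = np.arange(10.0)
stats = distance_stats_sketch(x, 2 * x + 1)  # y is a linear function of x
```

Under these definitions the distance correlation of a variable with a linear transformation of itself is 1, and test_statistic equals the number of observations times the squared distance covariance.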
References
- [1] Szekely, G.J., Rizzo, M.L., and Bakirov, N.K. (2007) “Measuring and testing dependence by correlation of distances”. Annals of Statistics, Vol. 35 No. 6, pp. 2769-2794.
Examples
>>> import numpy as np
>>> from statsmodels.stats.dist_dependence_measures import distance_statistics
>>> distance_statistics(np.random.random(1000), np.random.random(1000))
DistDependStat(test_statistic=0.07948284320205831, distance_correlation=0.04269511890990793, distance_covariance=0.008915315092696293, dvar_x=0.20719027438266704, dvar_y=0.21044934264957588, S=0.10892061635588891)