statsmodels.stats.correlation_tools.cov_nearest_factor_homog
- statsmodels.stats.correlation_tools.cov_nearest_factor_homog(cov, rank)
Approximate an arbitrary square matrix with a factor-structured matrix of the form k*I + XX’.
- Parameters:
- cov
array_like
The input array; must be square but need not be positive semidefinite
- rank
int
The rank of the fitted factor structure
- Returns:
A FactoredPSDMatrix instance containing the fitted matrix
Notes
This routine is useful if one has an estimated covariance matrix that is not SPD, and the ultimate goal is to estimate the inverse, square root, or inverse square root of the true covariance matrix. The factor structure allows these tasks to be performed without constructing any n x n matrices.
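To illustrate why the factor structure avoids n x n matrices, the following is a minimal NumPy sketch of solving a linear system with k*I + XX' through the Woodbury identity. The helper name `factored_solve` and the test data are hypothetical and are not part of statsmodels; the sketch only shows the kind of computation the structure makes possible, not how the library implements it.

```python
import numpy as np

def factored_solve(k, X, rhs):
    """Solve (k*I + X X') z = rhs via the Woodbury identity.

    Hypothetical helper for illustration. Only an r x r system is
    formed, where r = X.shape[1]; no n x n matrix is constructed.
    """
    r = X.shape[1]
    small = k * np.eye(r) + X.T @ X                 # r x r matrix
    return (rhs - X @ np.linalg.solve(small, X.T @ rhs)) / k

rng = np.random.default_rng(0)
n, r, k = 500, 3, 0.7
X = rng.standard_normal((n, r))
rhs = rng.standard_normal(n)

z = factored_solve(k, X, rhs)

# Verify (k*I + X X') z = rhs, again without forming an n x n matrix.
residual = k * z + X @ (X.T @ z) - rhs
```

The cost is dominated by products with the n x r matrix X, so it grows linearly in n for fixed rank.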
The calculations use the fact that if k is known, then X can be determined from the eigen-decomposition of cov - k*I, which can in turn be easily obtained from the eigen-decomposition of cov. Thus the problem can be reduced to a 1-dimensional search for k that does not require repeated eigen-decompositions.
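The reduction described above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the library's implementation: it assumes a Frobenius-norm objective (the text does not name the loss), it uses a simple grid search in place of whatever optimizer statsmodels uses, and the helper name `factor_for_k` is hypothetical.

```python
import numpy as np

def factor_for_k(evals, evecs, k, rank):
    """Return X such that X X' is the best rank-`rank` PSD fit to cov - k*I.

    evals, evecs: eigenvalues (descending) and eigenvectors of cov.
    The eigenvalues of cov - k*I are evals - k; the eigenvectors are unchanged.
    """
    top = np.clip(evals[:rank] - k, 0.0, None)      # drop negative parts
    return evecs[:, :rank] * np.sqrt(top)

rng = np.random.default_rng(0)
n, rank, true_k = 8, 2, 0.5
B = rng.standard_normal((n, rank))
cov = true_k * np.eye(n) + B @ B.T                  # exactly k*I + X X'

# A single eigen-decomposition, reused for every candidate k.
evals, evecs = np.linalg.eigh(cov)
evals, evecs = evals[::-1], evecs[:, ::-1]          # sort descending

grid = np.linspace(0.0, 1.0, 101)                   # candidate values of k
errors = []
for k in grid:
    X = factor_for_k(evals, evecs, k, rank)
    errors.append(np.linalg.norm(cov - k * np.eye(n) - X @ X.T))

best_k = grid[int(np.argmin(errors))]
```

Because the test matrix was built with k = 0.5, the grid search recovers that value. A coarse grid is used here rather than a local optimizer because the objective in k need not be convex.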
If the input matrix is sparse, then cov - k*I is also sparse, so the eigen-decomposition can be done efficiently using sparse routines.
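As a rough illustration of the sparse case, the sketch below uses `scipy.sparse.linalg.eigsh` to compute only the leading eigenpairs of a sparse symmetric matrix and of its shifted version. The matrix is synthetic, the sizes and the value of k are arbitrary, and nothing here is claimed to mirror how statsmodels handles sparse input internally.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import eigsh

n, rank, k = 300, 3, 0.25

# Synthetic sparse symmetric matrix standing in for a thresholded covariance.
S = sparse.random(n, n, density=0.02, random_state=0, format="csr")
cov = (S + S.T) / 2 + sparse.identity(n, format="csr")

# Subtracting k*I only touches the diagonal, so the result stays sparse.
shifted = cov - k * sparse.identity(n, format="csr")

# Fixed starting vector so the iterative solver is deterministic.
v0 = np.ones(n)

# Leading `rank` eigenpairs, computed without a dense decomposition.
vals, vecs = eigsh(cov, k=rank, which="LA", v0=v0)
vals_shifted, _ = eigsh(shifted, k=rank, which="LA", v0=v0)
```

The leading eigenvalues of the shifted matrix equal those of `cov` minus k, so in practice only one of the two decompositions is needed.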
The one-dimensional search for the optimal value of k is not convex, so a local minimum could be obtained.
Examples
Hard thresholding a covariance matrix may result in a matrix that is not positive semidefinite. We can approximate a hard thresholded covariance matrix with a PSD matrix as follows:
>>> import numpy as np
>>> from statsmodels.stats.correlation_tools import cov_nearest_factor_homog
>>> np.random.seed(1234)
>>> b = 1.5 - np.random.rand(10, 1)
>>> x = np.random.randn(100,1).dot(b.T) + np.random.randn(100,10)
>>> cov = np.cov(x)
>>> cov = cov * (np.abs(cov) >= 0.3)
>>> rslt = cov_nearest_factor_homog(cov, 3)