Introduction
The R package corrplot provides a visual exploratory tool for correlation matrices that supports automatic variable reordering to help detect hidden patterns among variables.
corrplot is easy to use and offers a rich set of plotting options covering the visualization method, graphic layout, colors, legend, and text labels. It also provides p-values and confidence intervals to help users judge the statistical significance of the correlations.
corrplot() has about 50 parameters, but only a few of them are commonly used. In most cases, a correlation matrix plot takes a single line of code.
The most commonly used parameters include method, type, order, and diag.
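As a minimal sketch of the one-line usage, assuming the built-in mtcars data set as an illustrative input (any numeric data frame would do):

```r
library(corrplot)

# Correlation matrix of an example data set (mtcars is an assumption here).
M <- cor(mtcars)

# One line is enough for a default plot.
corrplot(M)

# The same matrix, setting the four most common parameters explicitly.
corrplot(M, method = "ellipse", type = "lower", order = "hclust", diag = FALSE)
```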
The corrplot package offers seven visualization methods (parameter method): 'circle', 'square', 'ellipse', 'number', 'shade', 'color', and 'pie'. With the default color settings, the color intensity of each glyph is proportional to the correlation coefficient.
'circle' and 'square': the areas of the circles or squares show the absolute values of the corresponding correlation coefficients.
'ellipse': the ellipses have their eccentricity parametrically scaled to the correlation value. This method comes from the work of D.J. Murdoch and E.D. Chu; see the References section.
'number': the coefficients are printed as numbers in different colors.
'color': squares of equal size in different colors.
'shade': similar to 'color', but the glyphs of negative coefficients are shaded. The 'pie' and 'shade' methods come from Michael Friendly's work.
'pie': the circles are filled clockwise for positive values and anti-clockwise for negative values.
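To compare the methods side by side, one option is to loop over them. This is only a sketch: the mtcars input and the use of title and mar to label each plot are illustrative choices, not something the text prescribes.

```r
library(corrplot)

M <- cor(mtcars)  # example input, assumed for illustration

# Draw the same matrix once per visualization method.
methods <- c("circle", "square", "ellipse", "number", "shade", "color", "pie")
for (m in methods) {
  # mar leaves room at the top so the title is not clipped.
  corrplot(M, method = m, title = m, mar = c(0, 0, 2, 0))
}
```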
corrplot.mixed() is a wrapper function for mixed visualization styles; it sets the visualization methods of the lower and upper triangles separately.
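A short sketch of a mixed plot, again assuming mtcars as example data; the particular pairing of methods is arbitrary:

```r
library(corrplot)

M <- cor(mtcars)  # example input, assumed for illustration

# Numbers in the lower triangle, ellipses in the upper triangle.
corrplot.mixed(M, lower = "number", upper = "ellipse")

# Other arguments, such as order, are passed through to corrplot().
corrplot.mixed(M, lower = "shade", upper = "pie", order = "hclust")
```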
There are three layout types (parameter type): 'full', 'upper', and 'lower'.
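For instance, the triangular layouts could be requested as follows (mtcars is assumed as example data):

```r
library(corrplot)

M <- cor(mtcars)  # example input, assumed for illustration

# Show only the upper triangle, including the diagonal.
corrplot(M, type = "upper")

# Show only the lower triangle and drop the diagonal glyphs.
corrplot(M, type = "lower", diag = FALSE)
```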
The correlation matrix can be reordered according to its coefficients. This is important for identifying hidden structure and patterns in the matrix.
## corrplot 0.94 loaded
Reorder a correlation matrix
The four ordering algorithms, 'AOE', 'FPC', 'hclust', and 'alphabet', work as follows.
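'AOE' is the angular order of the eigenvectors. It is calculated from the order of the angles $a_i$,

$$
a_i =
\begin{cases}
\arctan(e_{i2} / e_{i1}), & \text{if } e_{i1} > 0; \\
\arctan(e_{i2} / e_{i1}) + \pi, & \text{otherwise,}
\end{cases}
$$

where $e_1$ and $e_2$ are the eigenvectors associated with the two largest eigenvalues of the correlation matrix, and $e_{i1}$, $e_{i2}$ are their $i$-th components. See Michael Friendly (2002) for details.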
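The angular ordering can be sketched in base R as below. This is an illustration of the idea rather than the package's own code: it assumes mtcars as input and takes e1 and e2 to be the first two columns of eigen()$vectors, that is, the eigenvectors belonging to the two largest eigenvalues.

```r
M <- cor(mtcars)  # example input, assumed for illustration

# eigen() returns eigenvectors sorted by decreasing eigenvalue,
# so columns 1 and 2 belong to the two largest eigenvalues.
ev <- eigen(M)$vectors[, 1:2]
e1 <- ev[, 1]
e2 <- ev[, 2]

# Angle of each variable in the plane spanned by e1 and e2;
# pi is added when the first coordinate is not positive.
alpha <- ifelse(e1 > 0, atan(e2 / e1), atan(e2 / e1) + pi)

# Sorting the variables by angle gives the ordering.
ord <- order(alpha)
M_aoe <- M[ord, ord]
```

Variables whose angles are close to each other end up adjacent in M_aoe, which is what groups similarly correlated variables together in the plot.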
'FPC' is the first principal component order.
'hclust' is the hierarchical clustering order; 'hclust.method' sets the agglomeration method to be used and should be one of 'ward', 'ward.D', 'ward.D2', 'single', 'complete', 'average', 'mcquitty', 'median', or 'centroid'.
'alphabet' is alphabetical order.
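The ordering options are selected through the order argument. A brief sketch, assuming mtcars as example data and an arbitrary choice of agglomeration method:

```r
library(corrplot)

M <- cor(mtcars)  # example input, assumed for illustration

# Angular order of the eigenvectors.
corrplot(M, order = "AOE")

# First principal component order.
corrplot(M, order = "FPC")

# Hierarchical clustering order with a chosen agglomeration method.
corrplot(M, order = "hclust", hclust.method = "ward.D2")

# Alphabetical order of the variable names.
corrplot(M, order = "alphabet")
```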
You can also reorder the matrix manually with the function corrMatOrder().
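A sketch of manual reordering, assuming mtcars as example data: corrMatOrder() returns a permutation of the variable indices, which is then applied to both rows and columns before plotting.

```r
library(corrplot)

M <- cor(mtcars)  # example input, assumed for illustration

# Compute the permutation only; nothing is plotted at this step.
ord <- corrMatOrder(M, order = "hclust", hclust.method = "average")

# Apply the same permutation to rows and columns.
M_ord <- M[ord, ord]

# Plot the already reordered matrix.
corrplot(M_ord)
```

Keeping the permutation separate makes it possible to reuse the same ordering for several matrices, for example a matrix of p-values that has to stay aligned with the correlations.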
When order is 'hclust', corrplot() can draw rectangles around clusters in the correlation matrix plot, based on the results of the hierarchical clustering.
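The rectangles are requested with the addrect argument, which gives the number of clusters to outline. A sketch, with mtcars and the cluster counts chosen only for illustration:

```r
library(corrplot)

M <- cor(mtcars)  # example input, assumed for illustration

# Outline two clusters found by hierarchical clustering.
corrplot(M, order = "hclust", addrect = 2)

# A different agglomeration method and a finer partition.
corrplot(M, order = "hclust", hclust.method = "ward.D2", addrect = 4)
```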