How to install ggcorrplot?
To do this, we will install the ggcorrplot package, which makes it easy to visualize a correlation matrix. The package also provides a function for computing the matrix of correlation p-values. The cor_pmat() function computes the matrix of correlation p-values, and the ggcorrplot() function displays the correlation matrix using ggplot2.
Syntax: cor_pmat(x, ...)
Where x is a data frame or a matrix.
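As a minimal sketch of such a call, the snippet below passes the built-in USArrests data frame as x. The variable name p.mat is only an illustrative choice, and the package is assumed to be installed already.

```r
library(ggcorrplot)

# General form: cor_pmat(x, ...), where x is a data frame or matrix.
# Extra arguments in ... are passed on to stats::cor.test().
p.mat <- cor_pmat(USArrests)

# The result is a square matrix with one row and one column per variable.
dim(p.mat)
```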
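Syntax: ggcorrplot(corr, method = c("square", "circle"), type = c("full", "lower", "upper"), ...)
Where corr is the correlation matrix to visualize.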
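Because the argument list can differ between package versions, one way to check it is to print the signature from the installed package rather than rely on a written summary. The sketch below assumes ggcorrplot is already installed.

```r
library(ggcorrplot)

# Names of all arguments accepted by the installed version of ggcorrplot().
names(formals(ggcorrplot))

# Full signature, including default values.
args(ggcorrplot)
```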
We first install the ggcorrplot and ggplot2 packages with install.packages() and load them with library(). We also need a dataset from which to build the correlation matrix. We create the correlation matrix with the cor() function, which computes the correlation coefficients, and then compute the matrix of correlation p-values with the cor_pmat() function. Finally, we visualize the correlation matrix with the ggcorrplot() function, which draws it using ggplot2.
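The install and load step might look like the sketch below. The requireNamespace() guard and the CRAN mirror URL are assumptions added so the snippet can be rerun without reinstalling; a plain install.packages("ggcorrplot") works as well.

```r
# Install ggcorrplot from CRAN only if it is missing.
# ggplot2 is a dependency and is installed along with it.
if (!requireNamespace("ggcorrplot", quietly = TRUE)) {
  install.packages("ggcorrplot", repos = "https://cloud.r-project.org")
}

# Load both packages for the current session.
library(ggplot2)
library(ggcorrplot)
```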
To illustrate the approach, we will use the built-in USArrests dataset and visualize its correlation matrix following the steps above. We load the data with the data() function and create the correlation matrix with the cor() function. The round() function rounds the values to a given number of decimal places. We then use the cor_pmat() function to compute the matrix of correlation p-values.
Example: Creating a correlation matrix
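The steps described above can be sketched as follows. The variable names corr and p.mat, and the choice of one decimal place, are illustrative assumptions rather than anything fixed by the package.

```r
library(ggcorrplot)

# Load the built-in dataset (4 numeric variables, 50 US states).
data(USArrests)

# Correlation matrix, rounded to one decimal place.
corr <- round(cor(USArrests), 1)
head(corr)

# Matrix of p-values for the pairwise correlation tests.
p.mat <- cor_pmat(USArrests)
head(p.mat)
```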
Now that we have the correlation matrix and the matrix of p-values, we can visualize the correlation matrix. The first visualization uses the ggcorrplot() function to plot the matrix with the square method and with the circle method.
Example: Visualizing the correlation matrix using different methods
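A minimal sketch of the two methods is shown below. It rebuilds the correlation matrix so that it runs on its own; the object names p_square and p_circle are hypothetical.

```r
library(ggcorrplot)

corr <- round(cor(USArrests), 1)

# Square tiles (the default method).
p_square <- ggcorrplot(corr, method = "square")
print(p_square)

# Circles instead of square tiles.
p_circle <- ggcorrplot(corr, method = "circle")
print(p_circle)
```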
Example: Visualizing correlation matrix using different layouts
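The original code for this example is not available here, so the sketch below assumes that "layouts" refers to the type argument of ggcorrplot(), which selects the full matrix or only its lower or upper triangle.

```r
library(ggcorrplot)

corr <- round(cor(USArrests), 1)

# Show only the lower triangle of the matrix.
p_lower <- ggcorrplot(corr, type = "lower")
print(p_lower)

# Show only the upper triangle of the matrix.
p_upper <- ggcorrplot(corr, type = "upper")
print(p_upper)
```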
We will now reorder the correlation matrix using hierarchical clustering. We do this by calling the ggcorrplot() function with the correlation matrix, hc.order, and outline.color as arguments.
Example: Reordering of the correlation matrix
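One way to write this call is sketched below. The outline color "white" is an assumed value chosen for illustration; any valid color name can be used.

```r
library(ggcorrplot)

corr <- round(cor(USArrests), 1)

# hc.order = TRUE reorders rows and columns by hierarchical clustering;
# outline.color sets the border color of each tile.
p_ordered <- ggcorrplot(corr, hc.order = TRUE, outline.color = "white")
print(p_ordered)
```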
We will now add the correlation coefficients to the plot by calling the ggcorrplot() function with the correlation matrix, hc.order, type (set to lower), and lab as arguments.
Example: Introducing correlation coefficient
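A sketch of this call follows. It assumes the lower-triangle layout described above and uses lab = TRUE to print each coefficient inside its tile.

```r
library(ggcorrplot)

corr <- round(cor(USArrests), 1)

# lab = TRUE writes the correlation coefficient on each tile.
p_labeled <- ggcorrplot(corr, hc.order = TRUE, type = "lower", lab = TRUE)
print(p_labeled)
```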
The significance level is denoted by alpha. We compare each p-value with the significance level to decide whether the correlation between two variables is significant. If the p-value is less than or equal to alpha, the correlation is significant; otherwise it is non-significant.
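The comparison can be made explicit on the p-value matrix, as in the sketch below. The value alpha = 0.05 is an assumed, conventional choice and the name significant is hypothetical.

```r
library(ggcorrplot)

p.mat <- cor_pmat(USArrests)

# Assumed significance level.
alpha <- 0.05

# TRUE where the correlation is significant (p-value <= alpha).
significant <- p.mat <= alpha
significant
```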
We will now add the significance level to the plot, crossing out the coefficients that are not significant. We do this by calling the ggcorrplot() function with the correlation matrix, hc.order, type, and the matrix of p-values as arguments.
Example: Adding coefficient significance level
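A minimal sketch of this step is given below. The p-value matrix is passed through the p.mat argument, and the lower-triangle layout is an assumption carried over from the earlier examples.

```r
library(ggcorrplot)

corr <- round(cor(USArrests), 1)
p.mat <- cor_pmat(USArrests)

# Tiles whose p-value exceeds the significance level are marked on the plot.
p_sig <- ggcorrplot(corr, hc.order = TRUE, type = "lower", p.mat = p.mat)
print(p_sig)
```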
We will now leave a cell blank wherever the coefficient is not significant. In the previous example, we added a significance level to our correlation matrix. Here, we blank out the cells of the correlation matrix whose coefficients are not significant.
We do this by calling the ggcorrplot() function with the correlation matrix, the matrix of p-values, hc.order, type, and insig as arguments.
Example: Leaving blank on no significance level
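This call might be written as in the sketch below, which assumes the same lower-triangle layout as before and sets insig = "blank" so that non-significant tiles are left empty.

```r
library(ggcorrplot)

corr <- round(cor(USArrests), 1)
p.mat <- cor_pmat(USArrests)

# insig = "blank" leaves non-significant tiles empty instead of marking them.
p_blank <- ggcorrplot(corr, p.mat = p.mat, hc.order = TRUE,
                      type = "lower", insig = "blank")
print(p_blank)
```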