Plotting multiple correlation matrices by a categorical variable using ggcorrplot

I created a simple correlation matrix using the ggcorrplot package and the following code:

library(ggcorrplot)
corr <- round(cor(data[,18:24], use = "complete.obs"),2)
gg <- ggcorrplot(corr)
print(gg)

What I would like to do now is create multiple correlation matrices from the same data, broken out by a categorical variable called "region" (column position 5), similar to what facet_wrap does. However, when I try that, I get an error. I've tried the following:

library(ggcorrplot)
corr <- round(cor(data[,18:24], use = "complete.obs"),2)
gg <- ggcorrplot(corr) +
facet_wrap("region", ncol = 2)
print(gg)

The error I get is "Error in combine_vars(data, params$plot_env, vars, drop = params$drop) : At least one layer must contain all variables used for facetting"

I understand that 'corr' is not referencing the "region" field, and I was wondering how I can accomplish this. So basically, the output would be 6 correlation matrices separated by "region" instead of just one correlation matrix for all of the data.

1 Answer

#1

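This probably isn't possible using ggcorrplot directly. It takes a correlation matrix as its input, melts it into a long-format data frame, and then builds the plot from that data frame with ggplot2. The "region" column never makes it into the plot data, so there is nothing for facet_wrap to facet on.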

But you could use the ggcorrplot source code to get what you want.

As a preliminary step, let's look at a "melted" correlation matrix.

(small_cor <- cor(replicate(2, rnorm(25))))
#>            [,1]       [,2]
#> [1,] 1.00000000 0.06064063
#> [2,] 0.06064063 1.00000000
(reshape2::melt(small_cor))
#>   Var1 Var2      value
#> 1    1    1 1.00000000
#> 2    2    1 0.06064063
#> 3    1    2 0.06064063
#> 4    2    2 1.00000000

It's a data frame version of a correlation matrix, where each row holds the correlation for one pair of variables from the original data.
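If it helps to see the melted shape without reshape2, here is a minimal base-R sketch of the same idea. The data frame `small`, its column names, and the seed are made up for illustration; `as.data.frame(as.table(...))` is a stand-in for `melt()`, not what ggcorrplot itself calls.

```r
# Base-R sketch of what melt() does to a correlation matrix.
set.seed(1)  # arbitrary seed, only so the example is reproducible
small <- data.frame(a = rnorm(25), b = rnorm(25), c = rnorm(25))
m <- cor(small)

# as.table() keeps the dimnames; as.data.frame() then emits one row per cell,
# with the first index varying fastest.
long <- as.data.frame(as.table(m), responseName = "value")

# Rows where a variable is paired with itself hold the diagonal of the matrix.
long[as.character(long$Var1) == as.character(long$Var2), ]
```

The result has the columns Var1, Var2 and value, which is the layout the plotting code further down expects.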

Now we'll get down to work with some sample data. There are 6 regions and 7 variables.

library(tidyverse)
library(reshape2)

my_data <- data.frame(region = factor(rep(1:6, each = 25)),
                      replicate(7, rnorm(6*25)))

We need the melted correlation matrices with the region IDs. Here's how I did it. There might be a nicer way. I think this might be the trickiest thing you'll have to do.

my_cors <- cbind(region = factor(rep(levels(my_data$region), each = 7^2)),
              do.call(rbind, lapply(split(my_data, my_data$region), function(x) melt(cor(x[,-1])))))
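One possibly nicer variant is to attach the region label inside the per-region step, so the labels cannot drift out of line with the stacked rows. This is only a sketch: `melt_one_region` and `my_cors_alt` are hypothetical names, the seed is arbitrary, and it uses base R's `as.data.frame(as.table(...))` in place of `melt()`.

```r
set.seed(42)  # arbitrary seed for reproducibility
my_data <- data.frame(region = factor(rep(1:6, each = 25)),
                      replicate(7, rnorm(6 * 25)))

# Melt one region's correlation matrix and tag every row with the region id.
melt_one_region <- function(x, id) {
  long <- as.data.frame(as.table(cor(x[, -1])), responseName = "value")
  cbind(region = id, long)
}

by_region <- split(my_data, my_data$region)

# Map() walks the per-region data frames and their names in parallel.
my_cors_alt <- do.call(rbind, Map(melt_one_region, by_region, names(by_region)))
my_cors_alt$region <- factor(my_cors_alt$region, levels = levels(my_data$region))
rownames(my_cors_alt) <- NULL
```

The result has the same columns (region, Var1, Var2, value) as my_cors, so it should drop into the plotting code unchanged.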

Now I will copy and paste from the ggcorrplot source code. First, a few lines from the argument list to get some defaults:

ggtheme = ggplot2::theme_minimal
colors = c("blue", "white", "red")
outline.color = "gray"
legend.title = "Corr"
tl.cex = 12
tl.srt = 45

Now I cut and paste the relevant parts of ggcorrplot and stick a facet_wrap at the end to get what you wanted.

my_cors %>% 
  ggplot(aes(Var1, Var2, fill = value)) + 
  geom_tile(color = outline.color) + 
  scale_fill_gradient2(low = colors[1], 
                       high = colors[3], 
                       mid = colors[2], 
                       midpoint = 0,
                       limits = c(-1, 1), 
                       space = "Lab", 
                       name = legend.title) + 
  ggtheme() + theme(axis.text.x = element_text(angle = tl.srt,
                                               vjust = 1, 
                                               size = tl.cex, hjust = 1), 
                    axis.text.y = ggplot2::element_text(size = tl.cex)) + 
  coord_fixed() +
  facet_wrap("region", ncol=2)
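To apply this to the data in the question, where the grouping variable sits in column 5 and the numeric variables in columns 18:24, the melting step can be wrapped in a small helper that also keeps the `use = "complete.obs"` and `round(..., 2)` from the original call. This is a rough sketch under those assumptions: `melted_cors_by` is a hypothetical name, the `demo` data frame is invented for illustration, and the output column is deliberately called "region" so the plotting code above works as is.

```r
# Hypothetical helper: one melted, rounded correlation matrix per group.
# group_col and value_cols may be column names or positions.
melted_cors_by <- function(df, group_col, value_cols, digits = 2) {
  pieces <- lapply(split(df, df[[group_col]], drop = TRUE), function(x) {
    corr <- round(cor(x[, value_cols], use = "complete.obs"), digits)
    as.data.frame(as.table(corr), responseName = "value")
  })
  out <- do.call(rbind, pieces)
  # Repeat each group label once per row of that group's melted matrix.
  out$region <- factor(rep(names(pieces), times = vapply(pieces, nrow, integer(1))))
  rownames(out) <- NULL
  out
}

# Invented demo data with one missing value, standing in for the real data.
set.seed(7)
demo <- data.frame(region = rep(c("north", "south"), each = 30),
                   x = rnorm(60), y = rnorm(60), z = rnorm(60))
demo$x[3] <- NA
demo_cors <- melted_cors_by(demo, 1, 2:4)

# For the layout described in the question this would presumably be:
# my_cors <- melted_cors_by(data, 5, 18:24)
```

Note that "complete.obs" is applied within each region here, so a row with a missing value is dropped only from its own region's matrix.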
