R: correlation between two data frames by row

Date: 2021-11-19 22:55:28

I have 2 data frames with 5 columns and 100 rows each.

id       price1      price2     price3     price4     price5
 1         11.22      25.33      66.47      53.76      77.42
 2         33.56      33.77      44.77      34.55      57.42
...

I would like to get the correlation of the corresponding rows, basically:

for(i in 1:100){    
cor(df1[i, 1:5], df2[i, 1:5])    
}

but without using a for-loop. I'm assuming there's some way to use plyr to do it, but I can't seem to get it right. Any suggestions?

2 solutions

#1


20  

Depending on whether you want a cool or a fast solution, you can use either

diag(cor(t(df1), t(df2)))

which is cool but wasteful (it computes the correlations between all pairs of rows, and everything off the diagonal is discarded), or

A <- as.matrix(df1)
B <- as.matrix(df2)
sapply(seq.int(dim(A)[1]), function(i) cor(A[i,], B[i,]))

which computes only what you need but is a bit more to type.
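As a quick sanity check, the two approaches above can be compared on made-up data. This is only a sketch: the random data frames and the helper row_cor are assumptions for illustration and are not part of the original answer. row_cor is a third, fully vectorised variant that applies the Pearson formula row by row with rowMeans and rowSums, which avoids both the full correlation matrix and the per-row function calls.

```r
# Hypothetical example data: two 100 x 5 data frames of random numbers.
set.seed(1)
df1 <- as.data.frame(matrix(rnorm(500), nrow = 100))
df2 <- as.data.frame(matrix(rnorm(500), nrow = 100))

# Approach 1: full 100 x 100 correlation matrix, keep only the diagonal.
r_diag <- diag(cor(t(df1), t(df2)))

# Approach 2: one cor() call per pair of corresponding rows.
A <- as.matrix(df1)
B <- as.matrix(df2)
r_loop <- sapply(seq_len(nrow(A)), function(i) cor(A[i, ], B[i, ]))

# Illustrative helper (not from the answer): vectorised Pearson correlation
# of corresponding rows of two numeric matrices of equal dimensions.
row_cor <- function(A, B) {
  A <- A - rowMeans(A)  # centre each row
  B <- B - rowMeans(B)
  rowSums(A * B) / sqrt(rowSums(A^2) * rowSums(B^2))
}
r_vec <- row_cor(A, B)

# All three should agree up to floating-point tolerance.
all.equal(unname(r_diag), unname(r_loop))
all.equal(unname(r_vec), unname(r_loop))
```

For 100 rows the difference in cost is negligible; the diagonal approach only becomes expensive when the number of rows is large, because the intermediate matrix grows with the square of the row count.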

#2


4  

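I found that as.matrix is not required, as long as each row is flattened to a plain vector with unlist (cor treats a one-row data frame as a single observation of several variables and returns NA).

Correlations of corresponding rows of data frames df1 and df2:

sapply(1:nrow(df1), function(i) cor(unlist(df1[i,]), unlist(df2[i,])))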

and of corresponding columns:

sapply(1:ncol(df1), function(i) cor(df1[,i], df2[,i]))
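The difference between indexing rows and columns of a data frame matters here, so a small check may help. The two tiny data frames below are made up for illustration. A single row df1[i, ] is still a data frame, which cor treats as one observation of five variables, whereas a single column df1[, i] is already a plain numeric vector.

```r
# Hypothetical example data: two data frames with 2 rows and 5 columns.
df1 <- data.frame(a = c(1, 2), b = c(2, 1), c = c(3, 5), d = c(4, 3), e = c(5, 4))
df2 <- data.frame(a = c(2, 5), b = c(1, 4), c = c(4, 3), d = c(3, 2), e = c(5, 1))

# A one-row data frame is 1 observation x 5 variables, so cor() returns a
# 5 x 5 matrix rather than a single number, and no correlation can be
# estimated from a single observation.
bad <- suppressWarnings(cor(df1[1, ], df2[1, ]))
dim(bad)

# Flattening each row to a numeric vector gives one correlation per row.
good <- sapply(seq_len(nrow(df1)),
               function(i) cor(unlist(df1[i, ]), unlist(df2[i, ])))
good  # 0.8 and -0.6 for this data
```

Note that unlist only produces a numeric vector when every column is numeric; an id or character column would have to be dropped first.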
