Ok, first of all let me generate some sample data:
A_X01 <- c(34, 65, 23, 43, 22)
A_X02 <- c(2, 4, 7, 8, 3)
B_X01 <- c(24, 45, 94, 23, 54)
B_X02 <- c(4, 2, 4, 9, 1)
C_X01 <- c(34, 65, 876, 45, 87)
C_X02 <- c(123, 543, 86, 87, 34)
Var <- c(3, 5, 7, 2, 3)
DF <- data.frame(A_X01, A_X02, B_X01, B_X02, C_X01, C_X02, Var)
What I want to do is apply an equation to the corresponding A and B columns (A_X01 with B_X01, A_X02 with B_X02), with a third column, "Var", also used in the equation.
So far I have been doing this the following way:
DF$D_X01 <- (DF$A_X01 + DF$B_X01) * DF$Var
DF$D_X02 <- (DF$A_X02 + DF$B_X02) * DF$Var
My desired output is as follows:
  A_X01 A_X02 B_X01 B_X02 C_X01 C_X02 Var D_X01 D_X02
1    34     2    24     4    34   123   3   174    18
2    65     4    45     2    65   543   5   550    30
3    23     7    94     4   876    86   7   819    77
4    43     8    23     9    45    87   2   132    34
5    22     3    54     1    87    34   3   228    12
As you'll appreciate, this is a lot of code for something fairly simple, and my scripts are currently rather long because the actual dataset has many more such columns.
One of the apply functions must be the way to go, but I can't get my head around how to use it on corresponding columns. I did think about using lapply, but how would I get it to work over the two sets of columns so that the right columns are added together?
I've looked around and can't find a way to do this, even though it must be a fairly common problem.
Thanks.
EDIT: The original question was a bit confusing, so I have updated it with the desired output and some extra conditions.
2 solutions
#1
1
Try this
indx <- gsub("\\D", "", grep("A_X|B_X", names(DF), value = TRUE)) # Retrieving indexes
indx2 <- DF[grep("A_X|B_X", names(DF))] # Considering only the columns of interest
DF[paste0("D_X", unique(indx))] <-
sapply(unique(indx), function(x) rowSums(indx2[which(indx == x)])*DF$Var)
DF
#   A_X01 A_X02 B_X01 B_X02 C_X01 C_X02 Var D_X01 D_X02
# 1    34     2    24     4    34   123   3   174    18
# 2    65     4    45     2    65   543   5   550    30
# 3    23     7    94     4   876    86   7   819    77
# 4    43     8    23     9    45    87   2   132    34
# 5    22     3    54     1    87    34   3   228    12
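To see how the grouping in this answer works, here is a minimal sketch of the same idea with the intermediate objects spelled out. It is a variant, not the answer's exact code: `cols` and `groups` are names introduced here for illustration, and `split()` replaces the `which(indx == x)` lookup. It assumes the only digits in the column names are the numeric suffix.

```r
DF <- data.frame(
  A_X01 = c(34, 65, 23, 43, 22), A_X02 = c(2, 4, 7, 8, 3),
  B_X01 = c(24, 45, 94, 23, 54), B_X02 = c(4, 2, 4, 9, 1),
  C_X01 = c(34, 65, 876, 45, 87), C_X02 = c(123, 543, 86, 87, 34),
  Var   = c(3, 5, 7, 2, 3)
)

# Names of the A and B columns
cols <- grep("A_X|B_X", names(DF), value = TRUE)

# Strip every non-digit, leaving the numeric suffix of each column
indx <- gsub("\\D", "", cols)   # "01" "02" "01" "02"

# Group the column names by suffix (hypothetical helper object)
groups <- split(cols, indx)

# For each suffix, sum its columns row-wise and multiply by Var
res <- sapply(groups, function(g) rowSums(DF[g]) * DF$Var)

DF[paste0("D_X", names(groups))] <- res
```

Because the grouping is by suffix rather than by position, this should also cope with A and B columns that are interleaved or in a different order.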
#2
0
You may also try
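indxA <- grep("^A", colnames(DF))  # positions of the A columns
indxB <- grep("^B", colnames(DF))  # positions of the B columns, assumed to be in the same order
f1 <- function(x, y, z) (x + y) * z
# Derive the new names from the A column names; formatting the positions with
# sprintf('D_X%02d', indxA) only gives X01/X02 when the A columns come first
DF[sub("^A", "D", colnames(DF)[indxA])] <- Map(f1, DF[indxA], DF[indxB], list(DF$Var))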
DF
#   A_X01 A_X02 B_X01 B_X02 C_X01 C_X02 Var D_X01 D_X02
# 1    34     2    24     4    34   123   3   174    18
# 2    65     4    45     2    65   543   5   550    30
# 3    23     7    94     4   876    86   7   819    77
# 4    43     8    23     9    45    87   2   132    34
# 5    22     3    54     1    87    34   3   228    12
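If it is unclear how `Map` lines the columns up, a small standalone sketch may help. The lists `a` and `b` and the vector `v` are made-up toy inputs, not part of the question's data. `Map` walks its list arguments in parallel, and a length-one list such as `list(v)` is recycled for every pair.

```r
# Toy inputs (hypothetical), standing in for DF[indxA], DF[indxB] and DF$Var
a <- list(p = c(1, 2), q = c(3, 4))
b <- list(c(10, 20), c(30, 40))
v <- c(2, 3)

# First call uses a$p and b[[1]], second call uses a$q and b[[2]];
# list(v) has length one, so the same v is passed to both calls
res <- Map(function(x, y, z) (x + y) * z, a, b, list(v))

res$p  # (1 + 10) * 2 and (2 + 20) * 3
res$q  # (3 + 30) * 2 and (4 + 40) * 3
```

The result takes its names from the first list, which is why assigning it to new column names in the data frame works.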
Or you could use mapply
DF[sub("^A", "D", colnames(DF)[indxA])] <- mapply(`+`, DF[indxA], DF[indxB]) * DF$Var
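The `Map` and `mapply` versions pair the A and B columns by position, so they rely on both sets being in the same order. A minimal sketch of pairing by name instead, assuming every `A_<suffix>` column has a matching `B_<suffix>` column (the names `suffixes`, `a_cols` and `b_cols` are introduced here for illustration):

```r
DF <- data.frame(
  A_X01 = c(34, 65, 23, 43, 22), A_X02 = c(2, 4, 7, 8, 3),
  B_X01 = c(24, 45, 94, 23, 54), B_X02 = c(4, 2, 4, 9, 1),
  C_X01 = c(34, 65, 876, 45, 87), C_X02 = c(123, 543, 86, 87, 34),
  Var   = c(3, 5, 7, 2, 3)
)

# Suffixes taken from the A columns, e.g. "X01", "X02"
suffixes <- sub("^A_", "", grep("^A_", names(DF), value = TRUE))
a_cols <- paste0("A_", suffixes)
b_cols <- paste0("B_", suffixes)

# Fail early if a B column is missing (the assumption stated above)
stopifnot(all(b_cols %in% names(DF)))

# Data frame arithmetic is element-wise; Var is recycled down each column
DF[paste0("D_", suffixes)] <- (DF[a_cols] + DF[b_cols]) * DF$Var
```

This avoids the apply family altogether, which may be easier to read when the equation is simple arithmetic.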