I have a large data frame, which is pushing me away from my comfortable tidyverse tools. There is one column in the data frame that I need to multiply all the other columns by. How would I do this with data.table?
For example, I have the following toy data:
multiplier a1 a2
         1  1  2
         2  1  2
         3  1  2
And the desired result:
multiplier a1 a2
         1  1  2
         2  2  4
         3  3  6
In dplyr I would gather the a columns, multiply, and finally spread, but I am running into memory issues. How would I multiply each of the other columns by the multiplier column in data.table?
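A minimal sketch of the gather/multiply/spread approach described above, assuming dplyr and tidyr are available. The helper column row_id and the names long and wide are hypothetical, added here only so the rows can be rebuilt unambiguously; they are not from the original post.

```r
library(dplyr)
library(tidyr)

# Toy data from the question
my_data <- tibble(multiplier = 1:3,
                  a1 = c(1L, 1L, 1L),
                  a2 = c(2L, 2L, 2L))

# Long format: one row per (original row, a-column) pair
long <- my_data %>%
  mutate(row_id = row_number()) %>%                 # hypothetical helper id
  gather(key = "column", value = "value", a1, a2) %>%
  mutate(value = value * multiplier)

# Back to wide format, dropping the helper id
wide <- long %>%
  spread(key = column, value = value) %>%
  select(-row_id)
```

The intermediate long table has one row for every row/column combination, so it grows with the number of a columns, which is a plausible source of the memory pressure.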
2 Answers
#1
1
Based on David Arenburg's suggestion, base R can be very fast. Using his example above, you get the same output without installing any libraries:
multiplier = 1:3
a1 = c(1, 1, 1)
a2 = c(2, 2, 2)
data <- data.frame(multiplier,a1,a2)
data1<-data
Option 1
data[,2:3] <- data[,2:3] * data[, 1]
Option 2
# Use ncol(), not nrow(): the index selects columns 2 through the last column
data1[, 2:ncol(data1)] <- data1[, 2:ncol(data1)] * data1[, 1]
Output:
data
data1
  multiplier a1 a2
1          1  1  2
2          2  2  4
3          3  3  6
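Since the question asks about data.table specifically, here is a minimal sketch of the same multiplication done by reference, assuming the data.table package is installed. The names dt and cols are illustrative and not taken from the original posts.

```r
library(data.table)

# Toy data from the question
dt <- data.table(multiplier = 1:3,
                 a1 = c(1, 1, 1),
                 a2 = c(2, 2, 2))

# Columns to scale: everything except the multiplier itself
cols <- setdiff(names(dt), "multiplier")

# Update the selected columns in place; .SD contains only the columns in .SDcols
dt[, (cols) := lapply(.SD, function(x) x * multiplier), .SDcols = cols]
```

Because := modifies dt in place rather than building a new table, this may help with the memory issues mentioned in the question.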
#2
1
You can do this without spreading the data:
library(dplyr)
my_data %>%
  # funs() is deprecated as of dplyr 0.8.0; a formula lambda does the same job
  mutate_at(c("a1", "a2"), ~ . * multiplier)
# A tibble: 3 x 3
# multiplier a1 a2
# <int> <int> <int>
# 1 1 1 2
# 2 2 2 4
# 3 3 3 6
Data
my_data <- tibble(multiplier = 1:3,
                  a1 = c(1L, 1L, 1L),
                  a2 = c(2L, 2L, 2L))
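If there are many columns to scale, naming each one becomes tedious. A minimal sketch of a variant that selects every column except multiplier, assuming dplyr 1.0.0 or later (where across() is available); the name result is illustrative.

```r
library(dplyr)

my_data <- tibble(multiplier = 1:3,
                  a1 = c(1L, 1L, 1L),
                  a2 = c(2L, 2L, 2L))

# Multiply every column except multiplier by the multiplier column
result <- my_data %>%
  mutate(across(-multiplier, ~ .x * multiplier))
```

Excluding multiplier from the selection keeps it unchanged, so each remaining column is scaled by the original values.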