For lack of a better word, how do I apply a "patch" to an R data.frame? Suppose I have a master database with firm and outlet columns and an ownership shares variable that is 1 or 0 in this example, but could be any percentage.
// master
    firm outlet shares.pre
1   five      1          0
2    one      1          1
3    red      1          0
4 yellow      1          0
5   five      2          0
6    one      2          0
// many more
I want to let firm "one" sell outlet "1" to firm "red". I have this transaction in another data.frame:
// delta
  firm outlet shares.delta
1  one      1           -1
2  red      1            1
What is the most efficient way in R to apply this "patch" or transaction to my master database? The end result should look like this:
// preferably master, NOT a copy
    firm outlet shares.post
1   five      1           0
2    one      1           0  <--- was 1
3    red      1           1  <--- was 0
4 yellow      1           0
5   five      2           0
6    one      2           0
// many more
I am not particular about keeping the suffixes pre, post or delta. If they were all named shares that would be fine too, I simply want to "add" these data frames.
UPDATE: my current approach is this
update <- (master$firm %in% delta$firm) & (master$outlet %in% delta$outlet)
master[update,]$shares <- master[update,]$shares + delta$shares
Yes, I'm aware it does a vector scan to create the Boolean update vector, and that the subsetting is also not very efficient. But the thing I dislike most about it is that I have to write out the matching columns.
2 solutions
#1
2
Another way, using data.table. Assuming you've loaded your two data sets into the data.frames df1 and df2:
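library(data.table)
# Convert both data.frames to data.tables
dt1 <- data.table(df1)
dt2 <- data.table(df2)
# Key both tables on the matching columns
setkey(dt1, firm, outlet)
setkey(dt2, firm, outlet)
# Join: keeps every row of dt1; shares.delta is NA where dt2 has no match
dt1 <- dt2[dt1]
# Rows without a transaction get a change of 0
dt1[is.na(dt1)] <- 0
dt1[, shares.post := shares.delta + shares.pre]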
#      firm outlet shares.delta shares.pre shares.post
# 1:   five      1            0          0           0
# 2:   five      2            0          0           0
# 3:    one      1           -1          1           0
# 4:    one      2            0          0           0
# 5:    red      1            1          0           1
# 6: yellow      1            0          0           0
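If the goal is to change the master table itself rather than build a joined copy, an update join may be a closer fit. The following is a minimal sketch, assuming both tables use a single shares column (which the question allows), that the change table has at most one row per firm/outlet pair, and a data.table version recent enough to support the on= argument. The names master, delta and key_cols are illustrative only:

```r
library(data.table)

# Toy data mirroring the question; the shared column name `shares` is an assumption.
master <- data.table(
  firm   = c("five", "one", "red", "yellow", "five", "one"),
  outlet = c(1L, 1L, 1L, 1L, 2L, 2L),
  shares = c(0, 1, 0, 0, 0, 0)
)
delta <- data.table(
  firm   = c("one", "red"),
  outlet = c(1L, 1L),
  shares = c(-1, 1)
)

# Key columns: everything the two tables share, except the value column.
key_cols <- setdiff(intersect(names(master), names(delta)), "shares")

# Update join: for every row of `delta`, locate the matching row of `master`
# and add the change by reference. `i.shares` is the `shares` column of `delta`.
master[delta, shares := shares + i.shares, on = key_cols]

print(master)
```

Because := modifies master by reference, the row order stays as it was and no second table is created. Deriving key_cols from the shared column names also avoids writing out the matching columns by hand.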
#2
1
I'd give a more precise answer if you had provided a reproducible example, but here's one way:
- Call your first data.frame dat and your second chg
Then you could merge the two:
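dat <- merge(dat, chg, all.x = TRUE)  # all.x = TRUE keeps rows that have no transaction
dat$shares.delta[is.na(dat$shares.delta)] <- 0  # no transaction means no change
And just add the two columns:
dat$shares <- with(dat, shares.pre + shares.delta)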
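Note that merge() returns a new data frame, sorted on the key columns by default. Where the original row order matters, a match()-based lookup is one alternative. This is only a sketch, assuming chg has at most one row per firm/outlet pair and that every pair in chg already exists in dat; key_cols, dat_key, chg_key and idx are illustrative names:

```r
# Toy data; column names follow the printouts in the question.
dat <- data.frame(
  firm       = c("five", "one", "red", "yellow", "five", "one"),
  outlet     = c(1L, 1L, 1L, 1L, 2L, 2L),
  shares.pre = c(0, 1, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)
chg <- data.frame(
  firm         = c("one", "red"),
  outlet       = c(1L, 1L),
  shares.delta = c(-1, 1),
  stringsAsFactors = FALSE
)

# Key columns are whatever the two data frames have in common.
key_cols <- intersect(names(dat), names(chg))

# Paste the key columns into one composite key per row.
dat_key <- do.call(paste, c(dat[key_cols], sep = "\t"))
chg_key <- do.call(paste, c(chg[key_cols], sep = "\t"))

# Row of `dat` that each change applies to.
idx <- match(chg_key, dat_key)
stopifnot(!anyNA(idx))  # assumption: every change hits an existing row

# Start from the old shares and add the change only on the matched rows.
dat$shares.post <- dat$shares.pre
dat$shares.post[idx] <- dat$shares.post[idx] + chg$shares.delta

print(dat)
```

Rows without a transaction keep their old value, so no NA handling is needed, and dat keeps its original row order.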