Update the values in one column based on conditions in other columns

Time: 2021-04-09 07:57:20

If my data frame (df) looks like this:

Name        State
John Smith  MI
John Smith  WI
Jeff Smith  WI

I want to rename the John Smith from WI to "John Smith1". What is the cleanest R equivalent of the following SQL statement?

update df 
set Name = "John Smith1"
where Name = "John Smith"
and State = "WI"

4 solutions

#1


15  

df <- data.frame(Name=c('John Smith', 'John Smith', 'Jeff Smith'),
                 State=c('MI','WI','WI'), stringsAsFactors=FALSE)

df <- within(df, Name[Name == 'John Smith' & State == 'WI'] <- 'John Smith1')

> df
         Name State
1  John Smith    MI
2 John Smith1    WI
3  Jeff Smith    WI

#2


7  

One way:

df[df$Name == "John Smith" & df$State == "WI", "Name"] <- "John Smith1"

Another way, using dplyr:

library(dplyr)
df <- df %>% mutate(Name = ifelse(State == "WI" & Name == "John Smith", "John Smith1", Name))

Note: As David Arenburg says, the first column should not be a factor. To ensure this, read the data set with stringsAsFactors = FALSE.
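
To illustrate why the factor issue matters, here is a minimal sketch using the question's data. It assumes base R only; the names `hit`, `broken` and `fixed` are made up for this example. Since R 4.0.0 the default is stringsAsFactors = FALSE, so the sketch sets it to TRUE explicitly to reproduce the problem.

```r
# Build the example data with Name stored as a factor (the problematic case)
df <- data.frame(Name = c("John Smith", "John Smith", "Jeff Smith"),
                 State = c("MI", "WI", "WI"), stringsAsFactors = TRUE)

# Rows to update: John Smith from WI
hit <- df$Name == "John Smith" & df$State == "WI"

# Assigning a value that is not an existing factor level produces NA
# (R warns "invalid factor level, NA generated"; suppressed here)
broken <- df
suppressWarnings(broken$Name[hit] <- "John Smith1")
print(broken)

# Converting the column to character first makes the update work
fixed <- df
fixed$Name <- as.character(fixed$Name)
fixed$Name[hit] <- "John Smith1"
print(fixed)
```

If the column has to stay a factor, another option is to add the new level before assigning, but converting to character is usually the simpler route.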

#3


2  

You can also use the data.table package:

library(data.table)
setDT(df)[Name=="John Smith" & State=="WI", Name:=paste0(Name,"1")]

#4


0  

As the OP has mentioned that he has "a very big data frame", it might be advantageous to use a binary search:

library(data.table)
setDT(df)[.("John Smith",  "WI"), on = .(Name=V1, State=V2),
          Name := paste0(Name, 1)][]
          Name State
1:  John Smith    MI
2: John Smith1    WI
3:  Jeff Smith    WI

instead of a vector scan:

setDT(df)[State == "WI" & Name == "John Smith", Name := paste0(Name, "1")]

In both variations the data object is updated by reference, i.e., without copying the whole object, which saves time and memory.
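
A small sketch of what "updated by reference" means in practice, assuming the data.table package is installed. The names `dt`, `before` and `alias` are made up for this example. A plain assignment of a data.table creates a second name for the same object, so an update with `:=` is visible through both names, while `copy()` gives an independent snapshot.

```r
library(data.table)

dt <- data.table(Name = c("John Smith", "John Smith", "Jeff Smith"),
                 State = c("MI", "WI", "WI"))

before <- copy(dt)  # independent snapshot, unaffected by later updates
alias <- dt         # no copy is made: both names refer to the same table

# Update in place; no new table is allocated
dt[Name == "John Smith" & State == "WI", Name := paste0(Name, "1")]

print(alias)   # shows the update, because alias and dt are the same object
print(before)  # still holds the original values
```

This is also why `copy()` is needed whenever the original values should be kept around after an update with `:=`.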
