Changing values in one column based on another in R

Time: 2021-10-30 07:54:15

I am using R and trying to change the values in one column of a data frame by comparing that column with another one. I have something like

Median   MyPrice
10       0
20       18
20       20
30       35
15       NA

And I would like to say something like

if(MyPrice == 0 & MyPrice < Median){MyPrice <- 1
  }else if (MyPrice == Median){MyPrice <- 2
  }else if (MyPrice > Median){MyPrice <- 3
  }else {MyPrice <- 4}

To come up with

Median   MyPrice
10       1
20       1
20       2
30       3
15       4

But there is always an error. I have also tried something like

for(i in MyPrice){if(MyPrice == 0 & MyPrice < Median){MyPrice <- 1
  }else if (MyPrice == Median){MyPrice <- 2
  }else if (MyPrice > Median){MyPrice <- 3
  }else {MyPrice <- 4}
  }

The for loop runs but it changes all values in MyPrice to 4. I also tried the ifelse() function but it seemed to have an issue taking that many arguments at once.

I would also not be opposed to a new column being added to the end of the data frame if a solution like that is easier.

2 solutions

#1


You don't necessarily have to use a for loop. Start by setting every comparison to 4.

> x$Comp=4
> x$Comp[x$Median>x$MyPrice]=1 #if Median is higher, comparison = 1
> x$Comp[x$Median==x$MyPrice]=2 #if Median is equal to MyPrice, comparison = 2
> x$Comp[x$Median<x$MyPrice]=3 #if Median is lower, comparison = 3
> x
  Median MyPrice Comp
1     10       0    1
2     20      18    1
3     20      20    2
4     30      35    3
5     15      NA    4
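
If you would rather overwrite MyPrice itself than add a Comp column, the same idea seems to work as long as the comparisons are computed before anything is overwritten. This is only a sketch: the data frame x is rebuilt here from the table in the question, and the names lower, equal and higher are made up for illustration.

```r
# Rebuild the example data from the question (assumed layout)
x <- data.frame(Median  = c(10, 20, 20, 30, 15),
                MyPrice = c(0, 18, 20, 35, NA))

# Compute every mask from the ORIGINAL values before overwriting anything.
# !is.na() turns the NA comparisons into FALSE so the NA row matches no mask.
lower  <- !is.na(x$MyPrice) & x$MyPrice <  x$Median
equal  <- !is.na(x$MyPrice) & x$MyPrice == x$Median
higher <- !is.na(x$MyPrice) & x$MyPrice >  x$Median

x$MyPrice <- 4          # default, also covers the NA row
x$MyPrice[lower]  <- 1  # MyPrice below Median
x$MyPrice[equal]  <- 2  # MyPrice equal to Median
x$MyPrice[higher] <- 3  # MyPrice above Median
x
```

Because the masks are stored first, it does not matter that MyPrice no longer holds the original prices by the time the codes are assigned.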

#2


Given your first condition, MyPrice == 0 & MyPrice < Median, your 2nd row (Median: 20, MyPrice: 18) should also be 4, not 1. Here is a working nested ifelse statement, followed by an NA handler.

df <- as.data.frame(matrix(c(10,0,20,18,20,20,30,35,15,NA), byrow = TRUE, ncol = 2))
colnames(df) <- c("Median","MyPrice")

df$NewPrice <- ifelse(df$MyPrice == 0 & df$MyPrice < df$Median, 1, 
                      ifelse(df$MyPrice == df$Median, 2, 
                             ifelse(df$MyPrice > df$Median, 3, 4)))
df$NewPrice[is.na(df$MyPrice)] <- 4
df
#  Median MyPrice NewPrice
#1     10       0        1
#2     20      18        4
#3     20      20        2
#4     30      35        3
#5     15      NA        4
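
For what it's worth, the loop from the question can probably be made to work too. As far as I can tell it went wrong because it never uses i: it tests the whole MyPrice object and then replaces that object with a single number. if() also wants exactly one TRUE or FALSE, and a comparison with NA gives NA. Below is a rough sketch that looks at one row at a time. The data frame is rebuilt so the block runs on its own, and price and med are just illustrative names.

```r
# Same example data as above, rebuilt so this block is self-contained
df <- data.frame(Median  = c(10, 20, 20, 30, 15),
                 MyPrice = c(0, 18, 20, 35, NA))

df$NewPrice <- NA_real_
for (i in seq_len(nrow(df))) {
  price <- df$MyPrice[i]  # one value per iteration, not the whole column
  med   <- df$Median[i]
  if (is.na(price)) {
    df$NewPrice[i] <- 4   # handle NA first so the later tests never see it
  } else if (price == 0 && price < med) {
    df$NewPrice[i] <- 1
  } else if (price == med) {
    df$NewPrice[i] <- 2
  } else if (price > med) {
    df$NewPrice[i] <- 3
  } else {
    df$NewPrice[i] <- 4
  }
}
df
```

This should give the same NewPrice column as the nested ifelse() above. The vectorised versions are shorter, so the loop is mainly useful for seeing where the original attempt broke.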
