How do I modify some, but not all, variables of a data frame?

Time: 2022-02-28 22:55:50

Suppose there is a data.frame where some variables are coded as integers:

a <- c(1,2,3,4,5)
b <- as.integer(c(2,3,4,5,6))
c <- as.integer(c(5,1,0,9,2))
d <- as.integer(c(5,6,7,3,1))
e <- c(2,6,1,2,3)

df <- data.frame(a,b,c,d,e)
str(df)

Suppose I want to convert columns b through d to numeric:

varlist <- names(df)[2:4]

lapply(varlist, function(x) {
df$x <- as.numeric(x, data=x)
    })

str(df)

This does not work.

I tried:

df$b <- as.numeric(b, data=df)
df$c <- as.numeric(c, data=df)
df$d <- as.numeric(d, data=df)
str(df)

which works fine.

Questions: How do I do this in a loop or, better, with lapply? (I'm a Stata person and as such used to writing loops.)
And more generally: how do I apply any function to a list of variables in a data.frame
(e.g. multiply each variable on the list by some other variable, which either always stays the same
or, as a BONUS, changes with each variable on the list)?

2 answers

#1


1  

For the first question you can use sapply:

df[2:4] <- sapply(df[2:4],as.numeric)
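
As a side note on why the attempt in the question fails: inside the anonymous function, df$x refers to a column literally named "x", x itself is a column name (a character string) rather than the column's values, and the assignment only changes a local copy of df. A minimal sketch of an alternative, assuming the same example data: lapply returns a list with one element per column, so each column keeps its own type instead of passing through a matrix as with sapply.

```r
# Rebuild the example data from the question.
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = as.integer(c(2, 3, 4, 5, 6)),
  c = as.integer(c(5, 1, 0, 9, 2)),
  d = as.integer(c(5, 6, 7, 3, 1)),
  e = c(2, 6, 1, 2, 3)
)

varlist <- names(df)[2:4]  # the columns to convert: "b", "c", "d"

# lapply() returns a list with one converted vector per column;
# assigning that list to df[varlist] replaces only those columns.
df[varlist] <- lapply(df[varlist], as.numeric)

str(df)
```

Selecting by name (df[varlist]) rather than by position (df[2:4]) keeps the code working if the column order changes.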

For the second question, you can use mapply. For example, to multiply the three variables (columns 2 to 4) by three different random scalars:

df[2:4] <-  mapply(function(x,y)df[[x]]*y,2:4,rnorm(3))
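
The same idea can be sketched with Map, which is mapply without simplification and therefore returns a list rather than a matrix. The multipliers below are hypothetical fixed values, chosen instead of rnorm(3) only so that the result is reproducible.

```r
# Rebuild the example data from the question.
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = as.integer(c(2, 3, 4, 5, 6)),
  c = as.integer(c(5, 1, 0, 9, 2)),
  d = as.integer(c(5, 6, 7, 3, 1)),
  e = c(2, 6, 1, 2, 3)
)

cols <- c("b", "c", "d")
multipliers <- c(10, 100, 1000)  # hypothetical values, one per column

# Map() walks over the columns and the multipliers in parallel:
# b is paired with 10, c with 100, d with 1000.
df[cols] <- Map(function(column, m) column * m, df[cols], multipliers)

df
```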

#2


0  

df[,2:4] <- sapply(df[,2:4], as.numeric)
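
Since the question mentions being used to writing loops, the same conversion could also be written as a for loop. This is a minimal sketch, assuming the column names b, c and d from the question; the key point is to index with [[ ]] and a character name, because df$v would look for a column literally called "v".

```r
# Rebuild the example data from the question.
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = as.integer(c(2, 3, 4, 5, 6)),
  c = as.integer(c(5, 1, 0, 9, 2)),
  d = as.integer(c(5, 6, 7, 3, 1)),
  e = c(2, 6, 1, 2, 3)
)

# v takes the values "b", "c", "d" in turn; df[[v]] looks up the
# column whose name is stored in v.
for (v in c("b", "c", "d")) {
  df[[v]] <- as.numeric(df[[v]])
}

str(df)
```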

As for your second question, if you want to, say, multiply column c by 5:

df$c <- df$c * 5

Or multiply by any vector of the same length as c, for example to create a new column that is c times d:

df$cd <- df$c * df$d
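
To extend this to a whole list of variables, one possible sketch is shown here, assuming the example data from the question. The first part multiplies each listed column by the same column e; the second part covers the BONUS case, where each column gets its own partner. The new column names and the pairing in partners are hypothetical, chosen only for illustration.

```r
# Rebuild the example data from the question.
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = as.integer(c(2, 3, 4, 5, 6)),
  c = as.integer(c(5, 1, 0, 9, 2)),
  d = as.integer(c(5, 6, 7, 3, 1)),
  e = c(2, 6, 1, 2, 3)
)

cols <- c("b", "c", "d")

# Same multiplier for every column: each of b, c, d times e,
# stored in new columns b_e, c_e, d_e (hypothetical names).
df[paste0(cols, "_e")] <- lapply(df[cols], function(x) x * df$e)

# Different multiplier per column: b*a, c*e, d*a (hypothetical pairing).
partners <- c("a", "e", "a")
df[paste0(cols, "_", partners)] <- Map(
  function(v, p) df[[v]] * df[[p]],
  cols, partners
)

df
```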
