Suppose there are many data frames that need the same operation performed on them. For example:
prefix <- c("Mrs.","Mrs.","Mr","Dr.","Mrs.","Mr.","Mrs.","Ms","Ms","Mr")
measure <- rnorm(10)
df1 <- data.frame(prefix,measure)
df1$gender[df1$prefix=="Mrs."] <- "F"
This creates an indicator variable called gender, set to "F" in every row where prefix is "Mrs.". To loop over the data frames by name, I adapted a general approach for looping over string variables in R from here, adding as.name() to remove the quotes from "i":
dflist <- c("df1","df2","df3","df4","df5")
for (i in dflist) {
as.name(i)$gender[as.name(i)$prefix=="Ms."] <- "F"
}
Unfortunately this doesn't work. Any suggestions?
2 solutions
#1
8
Put all your data frames into a list, and then loop/lapply over them. It'll be much easier on you in the long run.
dfList <- list(df1=df1, df2=df2, ....)
dfList <- lapply(dfList, function(df) {
df$gender[df$prefix == "Mrs."] <- "F"
df
})
dfList$df1
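If the data frames already exist in the workspace under the names held in a character vector, as in the question, the list can probably be built from those names instead of being typed out. A minimal sketch, assuming base R's mget() and list2env(); the two small data frames and the add_gender helper are hypothetical stand-ins, not part of the original answer:

```r
# Two small stand-in data frames (hypothetical data, for illustration only)
df1 <- data.frame(prefix = c("Mrs.", "Mr.", "Dr."), measure = c(1.2, 0.4, -0.7))
df2 <- data.frame(prefix = c("Ms", "Mrs.", "Mr"), measure = c(0.3, -1.1, 2.0))

dfnames <- c("df1", "df2")

# mget() looks up each name and returns a named list of the objects
dfList <- mget(dfnames, envir = environment())

# Hypothetical helper: same recoding as above, returning the whole data frame
add_gender <- function(df) {
  df$gender <- NA_character_             # create the column explicitly
  df$gender[df$prefix == "Mrs."] <- "F"
  df
}

dfList <- lapply(dfList, add_gender)

# Optional: copy the modified data frames back over the originals
invisible(list2env(dfList, envir = environment()))
```

Keeping everything in dfList is usually the simpler choice; the list2env() step is only needed if later code still refers to df1, df2 and so on by name.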
#2
2
The single-instance example would not really create an indicator in the usual sense, since the non-"F" values would be <NA>, and those do not work well within R functions. Both arithmetic operations and logical operations on them will return NA. Try this instead:
df1$gender <- ifelse(prefix %in% c("Mrs.", "Ms") , "probably F",
ifelse( prefix=="Dr.", "possibly F", # as is my wife.
"probably not F"))
Then follow @HongDoi's advice to use lists. And do not forget to a) return the full data frame object, and b) assign the result to an object name (both of which were illustrated, but are often forgotten by R-newbs).
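Putting the two answers together might look like the sketch below: the nested ifelse() is wrapped in a function that returns the whole data frame, and the result of lapply() is assigned back to the list. The function name recode_gender and the sample data are assumptions made for illustration.

```r
# Hypothetical sample data
df1 <- data.frame(prefix = c("Mrs.", "Mr", "Dr.", "Ms"),
                  measure = c(0.5, -0.2, 1.3, 0.8))
df2 <- data.frame(prefix = c("Mr.", "Mrs."),
                  measure = c(-1.0, 0.1))
dfList <- list(df1 = df1, df2 = df2)

recode_gender <- function(df) {
  # Use the data frame's own prefix column rather than a global variable
  df$gender <- ifelse(df$prefix %in% c("Mrs.", "Ms"), "probably F",
               ifelse(df$prefix == "Dr.", "possibly F",
                      "probably not F"))
  df  # a) return the full data frame
}

dfList <- lapply(dfList, recode_gender)  # b) assign the result

# Every row with a non-missing prefix gets a label, so no NA values appear here
table(dfList$df1$gender, useNA = "ifany")
```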