How do I set column names to lowercase for multiple data frames?

Time: 2021-11-14 22:57:57

I have a set of dataframes with the same column headings, except that some of the column names are in upper case and some are in lower case. I want to convert all the column names to lowercase so that I can make one big dataframe of everything.

I can't seem to get colnames() to work in any loop or apply I write. With:

#create dfs
df1<-data.frame("A" = 1:10, "B" = 2:11)
df2<-data.frame("a" = 3:12, "b" = 4:13)
df3<-data.frame("a" = 5:14, "b" = 6:15)
#I have many more dfs in my actual data

#make list of dfs, define lowercasing function, apply across df list
dfs<-ls(pattern = "df")
lowercols<-function(df){colnames(get(df))<-tolower(colnames(get(df)))}
lapply(dfs, lowercols)

I get the following error:

Error in colnames(get(df)) <- tolower(colnames(get(df))) : 
  could not find function "get<-"

How do I change all my dataframes to have lowercase column names?

3 solutions

#1


8  

The following should work:

dfList <- lapply(lapply(dfs,get),function(x) {colnames(x) <- tolower(colnames(x));x})

Problems like this generally stem from the fact that you haven't placed all your data frames in a single data structure, and then are forced to use something awkward, like get.

Note that in my code, I use lapply and get to first create a single list of data frames, and only then alter their colnames.

You should also be aware that your lowercols function is rather un-R-like. R functions are generally not written to return nothing and work only through side effects. If you try to write functions that way (which is possible), you will probably make your life difficult and run into scoping issues. Note that in my second lapply I explicitly return the modified data frame.
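
To illustrate the point about keeping everything in one data structure, here is a minimal sketch that builds the list directly instead of going through get. The names df_list, lower_names and big_df are made up for this example, and the final rbind step assumes that stacking the rows is what "one big dataframe" means in the question.

```r
# Example data mirroring the question's data frames.
df1 <- data.frame(A = 1:10, B = 2:11)
df2 <- data.frame(a = 3:12, b = 4:13)
df3 <- data.frame(a = 5:14, b = 6:15)

# Keep the data frames in one named list from the start.
df_list <- list(df1 = df1, df2 = df2, df3 = df3)

# Hypothetical helper: return a modified copy rather than
# trying to change the caller's object.
lower_names <- function(x) {
  colnames(x) <- tolower(colnames(x))
  x
}

df_list <- lapply(df_list, lower_names)

# With matching names, the frames can be stacked into one data frame.
big_df <- do.call(rbind, df_list)
dim(big_df)  # 30 rows, 2 columns
```

The original df1, df2 and df3 are left untouched; only the copies inside df_list are renamed.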

#2


4  

@joran's answer overlaps mine heavily, both in style and in "you probably want to do this differently" message. However, in the spirit of "give a man a fish and you feed him for a day; give him a sharp stick, and he can poke himself in the eye" ...

Here's a function that does what you want in the way that (you think) you want to do it:

dfnames <- ls(pattern = "df[0-9]+")  ## avoid 'dfnames' itself
lowercolnames <- function(df) {
    x <- get(df)
    colnames(x) <- tolower(colnames(x))
    ## normally I would use parent.frame(), but here we
    ##  have to go back TWO frames if this is used within lapply()
    assign(df,x,sys.frame(-2))
    ## OR (maybe simpler)
    ## assign(df,x,envir=.GlobalEnv)

    NULL
}
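
If in-place modification really is what you want, passing the target environment explicitly may be easier to reason about than counting frames with sys.frame(). The following is only a sketch under that assumption; lowercase_in_env and env are hypothetical names introduced here, and a fresh environment stands in for the workspace so the example is self-contained.

```r
# Hypothetical helper: lowercase the column names of the data frame
# called `name` that lives in environment `env`, modifying it in place.
lowercase_in_env <- function(name, env) {
  x <- get(name, envir = env)
  colnames(x) <- tolower(colnames(x))
  assign(name, x, envir = env)
  invisible(NULL)
}

# A separate environment standing in for the workspace.
env <- new.env()
assign("df1", data.frame(A = 1:10, B = 2:11), envir = env)
assign("df2", data.frame(a = 3:12, b = 4:13), envir = env)

# Apply the helper to every object whose name looks like df<number>.
for (name in ls(env, pattern = "^df[0-9]+$")) {
  lowercase_in_env(name, env)
}

colnames(env$df1)  # "a" "b"
```

Because the environment is an argument, the function behaves the same whether it is called from a loop, from lapply, or from inside another function.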

Here are two alternate functions that lowercase column names and return the result:

lowerCN2 <- function(x) {
    colnames(x) <- tolower(colnames(x))
    x
}

I include plyr::rename here for completeness, although in this case it's actually more trouble than it's worth.

lowerCN3 <- function(x) {
    plyr::rename(x,structure(tolower(colnames(x)),
                             names=colnames(x)))
}

dflist <- lapply(dfnames,get)
dflist <- lapply(dflist,lowerCN2)
dflist <- lapply(dflist,lowerCN3)

#3


1  

This doesn't directly answer your question, but it may solve the problem you're trying to solve; you can merge data.frames by different names via something like:

df1 <- data.frame("A" = 1:10, "B" = 2:11, x=letters[1:10])
df2 <- data.frame("a" = 3:12, "b" = 4:13, y=LETTERS[1:10])
merge(df1, df2, by.x=c("A","B"), by.y=c("a","b"), all=TRUE)
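
Keep in mind that merge joins on the key columns rather than stacking rows, so the result may not be what you expect if you wanted one long data frame. A small sketch of what the call above produces, using the same example data (the variable name merged is made up here):

```r
df1 <- data.frame(A = 1:10, B = 2:11, x = letters[1:10])
df2 <- data.frame(a = 3:12, b = 4:13, y = LETTERS[1:10])

# Full outer join: rows are matched where (A, B) equals (a, b).
merged <- merge(df1, df2, by.x = c("A", "B"), by.y = c("a", "b"), all = TRUE)

names(merged)  # "A" "B" "x" "y": key columns keep the names from df1
nrow(merged)   # 12: 8 matched key pairs plus 2 unmatched rows from each side
```

Rows that appear in only one input get NA in the other input's non-key column.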
