How to rename a column by column index with dplyr?

Time: 2022-08-05 10:38:52

The following code renames the first column in the data set:

library(dplyr)
# Replace the first name and keep all remaining names unchanged
mtcars %>%
        setNames(c("RenamedColumn", names(.)[-1]))

Desired results:

                    RenamedColumn cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4                    21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag                21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710                   22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1

Would it be possible to arrive at the same result using rename and a column index?

This:

mtcars %>%
    rename(1 = "ChangedNameAgain")

will fail:

Error in source("~/.active-rstudio-document", echo = TRUE) : 
  ~/.active-rstudio-document:7:14: unexpected '='
6: mtcars %>%
7:     rename(1 =
                ^

Similarly, trying to use rename_ or .[[1]] as the column reference returns an error.
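
A brief illustration of why the call above fails before dplyr is even involved: the error is raised by R's parser, which does not accept a numeric literal as an argument name. The snippet below is only a sketch using base R's parse(); the helper name parses_ok is made up for this example.

```r
# Hypothetical helper: TRUE if the text is syntactically valid R, FALSE otherwise.
# parse() only checks syntax; nothing is evaluated, so dplyr is not needed here.
parses_ok <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

# A number on the left of `=` is a syntax error ("unexpected '='").
bad <- parses_ok('rename(mtcars, 1 = "ChangedNameAgain")')

# The reverse order (new_name = position) is at least valid syntax.
good <- parses_ok('rename(mtcars, ChangedNameAgain = 1)')
```

So any working form has to put the new name on the left and the position on the right.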

3 solutions

#1


17  

As of dplyr 0.7.5, rlang 0.2.1, tidyselect 0.2.4, this simply works:

library(dplyr)

rename(mtcars, ChangedNameAgain = 1)

#                     ChangedNameAgain cyl  disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4                       21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag                   21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710                      22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive                  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout               18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
# ...
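
As a small extension of the call above, several columns can apparently be renamed by position in a single call. This is a sketch that assumes a dplyr version at least as recent as the one named in this answer; the new column names are invented for illustration.

```r
library(dplyr)

# Rename the 1st and 3rd columns of mtcars by position (new_name = position).
renamed <- rename(mtcars, MilesPerGallon = 1, Displacement = 3)

# Column order is unchanged; only the two names differ.
names(renamed)[1:4]
```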

Original answer and edits now obsolete:

The logic of rename() is new_name = old_name, so ChangedNameAgain = 1 would make more sense than 1 = ChangedNameAgain.

I would suggest:

mtcars %>% rename_(ChangedNameAgain = names(.)[1])
#                     ChangedNameAgain cyl  disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4                       21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag                   21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710                      22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive                  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout               18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
# Valiant                         18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1

Edit

I have yet to wrap my head around the new rlang-based dplyr programming system introduced in dplyr 0.6/0.7.

The underscore-suffixed version of rename used in my initial answer is now deprecated, and per @jzadra's comment, it didn't work anyway with syntactically problematic names like "foo bar".

Here is my attempt with the new rlang-based non-standard evaluation system. Do not hesitate to tell me in the comments what I've done wrong:

df <- tibble("foo" = 1:2, "bar baz" = letters[1:2])

# # A tibble: 2 x 2
#     foo `bar baz`
#   <int>     <chr>
# 1     1         a
# 2     2         b

First I tried rename() directly, but unfortunately I got an error. It seems to correspond to a FIXME in the source code (or is this FIXME unrelated?). I'm using dplyr 0.7.4, so it could work in the future:

df %>% rename(qux = !! quo(names(.)[[2]]))

# Error: Expressions are currently not supported in `rename()`

(Edit: as of dplyr 0.7.5 the error message reads: Error in UseMethod("rename_") : no applicable method for 'rename_' applied to an object of class "function")

(Update 2018-06-14: df %>% rename(qux = !! quo(names(.)[[2]])) now seems to work, still with dplyr 0.7.5; I'm not sure whether an underlying package changed.)

Here is a workaround using select. Unlike rename, it does not preserve the column order:

df %>% select(qux = !! quo(names(.)[[2]]), everything())

# # A tibble: 2 x 2
#     qux   foo
#   <chr> <int>
# 1     a     1
# 2     b     2

And if we want to put it in a function, we have to modify it slightly with := to allow unquoting on the left-hand side. To be robust to inputs such as strings and bare variable names, we have to use the "dark magic" (or so says the vignette) of enquo() and quo_name() (honestly, I don't fully understand what it does):

rename_col_by_position <- function(df, position, new_name) {
  new_name <- enquo(new_name)
  new_name <- quo_name(new_name)
  select(df, !! new_name := !! quo(names(df)[[position]]), everything())
}

This works with the new name given as a string:

rename_col_by_position(df, 2, "qux")

# # A tibble: 2 x 2
#     qux   foo
#   <chr> <int>
# 1     a     1
# 2     b     2

This works with the new name given as a quosure:

rename_col_by_position(df, 2, quo(qux))

# # A tibble: 2 x 2
#     qux   foo
#   <chr> <int>
# 1     a     1
# 2     b     2

This works with the new name given as a bare name:

rename_col_by_position(df, 2, qux)

# # A tibble: 2 x 2
#     qux   foo
#   <chr> <int>
# 1     a     1
# 2     b     2

And even this works:

rename_col_by_position(df, 2, `qux quux`)

# # A tibble: 2 x 2
#   `qux quux`   foo
#        <chr> <int>
# 1          a     1
# 2          b     2
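
If rename() accepts positions, as stated at the top of this answer, the function could probably be written around rename() instead of select(), which would keep the column order. The sketch below rests on that assumption; the name rename_col_by_position2 is hypothetical, and unlike the version above it only takes the new name as a string.

```r
library(dplyr)

# Hypothetical variant: `position` is a single column index,
# `new_name` is a character string (bare names are not handled here).
rename_col_by_position2 <- function(df, position, new_name) {
  stopifnot(is.character(new_name), length(new_name) == 1)
  # `:=` allows unquoting the new name on the left-hand side;
  # `!! position` inlines the index, so the call reads like rename(df, qux = 2).
  rename(df, !! new_name := !! position)
}

df <- tibble("foo" = 1:2, "bar baz" = letters[1:2])
out <- rename_col_by_position2(df, 2, "qux quux")
names(out)
```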

#2


5  

Here are a couple of alternative solutions that are arguably easier to read because they do not revolve around the . reference. select understands column indices, so if you're renaming the first column, you can simply do:

mtcars %>% select( RenamedColumn = 1, everything() )

However, the issue with using select is that it will reorder columns if you're renaming a column in the middle. To get around the issue, you have to pre-select the columns to the left of the one you're renaming:

## This will rename the 7th column without changing column order
mtcars %>% select( 1:6, RenamedColumn = 7, everything() )

Another option is to use the new rename_at, which also understands column indices:

## This will also rename the 7th column without changing the order
## Credit for simplifying the second argument: Moody_Mudskipper
mtcars %>% rename_at( 7, ~"RenamedColumn" )

The ~ is needed because rename_at is quite flexible and can accept functions as its second argument. For example, mtcars %>% rename_at( c(2,4), toupper ) will make the names of the second and fourth columns uppercase.
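
In more recent dplyr releases (1.0.0 and later, as far as I know), the scoped rename_at has been superseded by rename_with, which takes the function first and the column selection second. A hedged sketch of the same two renames, assuming such a version is installed:

```r
library(dplyr)

# Rename the 7th column to a fixed name; the formula ignores the old name.
a <- mtcars %>% rename_with(~ "RenamedColumn", .cols = 7)

# Uppercase the names of the 2nd and 4th columns.
b <- mtcars %>% rename_with(toupper, .cols = c(2, 4))

names(a)[7]
names(b)[1:4]
```

As with rename_at, the column order is left untouched.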

#3


1  

IMHO, rlang as suggested by @Aurele is overkill here.

Solution 1: Use a curly bracket pipe context:

bcMatrix %>% {colnames(.)[1] = "foo"; .}

Solution 2: Or (ab)use the tee operator %T>% from the magrittr package (installed anyway if dplyr is used, but it has to be attached with library(magrittr), since dplyr re-exports only %>%) to perform the renaming as a side effect:

bcMatrix %T>% {colnames(.)[1] = "foo"}

Solution 3: using a simple helper function:

rename_by_pos <- function(df, index, new_name) {
    colnames(df)[index] <- new_name
    df
}
iris %>% rename_by_pos(2, "foo")
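
The helper above can be made a little more defensive. The variant below is a sketch, not part of the original answer: the name rename_by_pos_checked and the specific checks are assumptions chosen for illustration. It uses only base R and also accepts several positions at once.

```r
rename_by_pos_checked <- function(df, index, new_name) {
  # Positions must be whole numbers inside the data frame,
  # and there must be exactly one new name per position.
  stopifnot(
    is.numeric(index),
    all(index == as.integer(index)),
    all(index >= 1 & index <= ncol(df)),
    length(index) == length(new_name)
  )
  colnames(df)[index] <- new_name
  df
}

renamed <- rename_by_pos_checked(iris, c(2, 5), c("foo", "bar"))
colnames(renamed)

# An out-of-range position stops with an error instead of silently adding a name.
failed <- inherits(try(rename_by_pos_checked(iris, 6, "baz"), silent = TRUE), "try-error")
```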
