Write a factor into a list of data frames (the factor is the name of each data frame)

Time: 2021-10-28 22:56:35

I have a list of data frames, list1, which I want to combine later with do.call("rbind", list1), but first I want to add an identifying factor to each data frame. This factor should be the name of the data frame:

    list1 <- lapply(vector("list", 6), function(x)
                    data.frame(replicate(10, sample(0:1, 1000, replace = TRUE))))
    names(list1) <- LETTERS[1:6]

For example, assign "A" to every row of the first data frame "A", and so on:

    list1[[1]]$Cat <- "A"
    list1[[2]]$Cat <- "B" # etc.

I tried something like

    list1 <- lapply(list1, function(x)
                           {list1[[x]]$Cat<- names(list1[[x]]); x})

but failed:

Error in list1[[x]] : invalid subscript type 'list'

How can I achieve what I want? Thank you.

1 solution

#1


This can be done easily using tidyverse packages:

    library(tidyverse)
    imap(list1, ~ mutate(.x, Cat = .y)) %>% bind_rows()

To break this down:

  1. imap from the purrr package passes every element of its first argument (list1 in this case), along with that element's name, to the function you provide as the second argument. By imap's convention, the function can refer to the element as .x and to the element's name as .y.

  2. The function in the second argument uses mutate from the dplyr package to create a new column named Cat.

  3. Lastly, bind_rows is the tidyverse equivalent of the do.call("rbind", list1) from your question.
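If you would rather stay in base R, the same three steps can be sketched with Map, which walks over the data frames and their names in parallel. This is only an illustration: the small list1 built here is a toy stand-in for the data in the question, and the names tagged and combined are made up for the example.

```r
# Toy stand-in for list1 (assumed data): three small data frames named A, B, C.
set.seed(1)
list1 <- lapply(1:3, function(i) data.frame(x = sample(0:1, 4, replace = TRUE)))
names(list1) <- LETTERS[1:3]

# Map() passes each data frame together with its name,
# playing the role of .x and .y in imap().
tagged <- Map(function(df, nm) {
  df$Cat <- nm   # add the identifying column
  df             # return the modified data frame
}, list1, names(list1))

# Same final step as in the question, then convert the label to a factor.
combined <- do.call("rbind", tagged)
combined$Cat <- factor(combined$Cat)
```

Iterating over the names alongside the elements is what the lapply attempt in the question was missing: lapply hands the function only the data frame itself, so there is no name or index to look up.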

EDIT: As joran pointed out in the comments, if your end goal is to concatenate all the data.frames together, bind_rows provides a convenient way to automatically prepend a column identifying the original data.frame:

    bind_rows(list1, .id = "Cat")
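One thing to keep in mind is that the .id column comes back as a character vector, not a factor. Since the question asks for a factor, a conversion step may be wanted afterwards. A minimal sketch, assuming dplyr is installed and using a tiny made-up list1 in place of the real data:

```r
library(dplyr)

# Tiny named list standing in for list1 (assumed data).
list1 <- list(A = data.frame(x = 1:2), B = data.frame(x = 3:5))

# The list names become the values of the new "Cat" column.
combined <- bind_rows(list1, .id = "Cat")

# Convert the character labels to a factor if one is required.
combined$Cat <- factor(combined$Cat)
table(combined$Cat)
```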
