Split a data.frame into new data.frames based on the levels of a factor

Time: 2022-08-16 22:48:12

I'm trying to create separate data.frame objects based on levels of a factor. So if I have:

df <- data.frame(
  x=rnorm(25),
  y=rnorm(25),
  g=rep(factor(LETTERS[1:5]), 5)
)

How can I split df into separate data.frames, one for each level of g, each containing the corresponding x and y values? I can get most of the way there using split(df, df$g), but I'd like each level of the factor to have its own data.frame. What's the best way to do this?

Thanks.

1 Answer

#1


71  

I think that split does exactly what you want.

Notice that X is a list of data frames, as seen by str:

X <- split(df, df$g)
str(X)
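
A quick way to check this, and to see that each piece is already reachable by its level name, is sketched below. It rebuilds the df from the question; the set.seed call and the group_means name are additions made here for illustration only.

```r
set.seed(1)  # assumption: added only so the example is repeatable
df <- data.frame(
  x = rnorm(25),
  y = rnorm(25),
  g = rep(factor(LETTERS[1:5]), 5)
)

X <- split(df, df$g)

names(X)        # one list element per level of g
class(X$A)      # each element is itself a data.frame
nrow(X[["B"]])  # number of rows that belong to level B

# Work on every piece without creating separate objects;
# group_means is a hypothetical name used for this sketch
group_means <- sapply(X, function(d) mean(d$x))
group_means
```

Keeping the pieces in the list means any per-group step can be written once with lapply or sapply instead of being repeated for A through E.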

If you want individual objects named after the levels of g, you could assign the elements of X returned by split to objects with those names, though this seems like extra work when you can just index the data frames in the list that split creates.

# lapply is used here only to drop the third column, g, which is no longer needed.
Y <- lapply(seq_along(X), function(x) as.data.frame(X[[x]])[, 1:2]) 

#Assign the dataframes in the list Y to individual objects
A <- Y[[1]]
B <- Y[[2]]
C <- Y[[3]]
D <- Y[[4]]
E <- Y[[5]]

#Or use lapply with assign to assign each piece to an object all at once
lapply(seq_along(Y), function(x) {
    assign(c("A", "B", "C", "D", "E")[x], Y[[x]], envir=.GlobalEnv)
    }
)

Edit: Even better than using lapply to assign into the global environment, use list2env:

names(Y) <- c("A", "B", "C", "D", "E")
list2env(Y, envir = .GlobalEnv)
A
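
As a possible shortcut (not part of the answer above), the g column can be dropped before splitting, so the resulting list already carries the level names and no manual names(Y) step is needed. The sketch below rebuilds the df from the question; the environment name groups is made up for illustration, and writing into a separate environment rather than .GlobalEnv is just one way to avoid overwriting existing objects.

```r
df <- data.frame(
  x = rnorm(25),
  y = rnorm(25),
  g = rep(factor(LETTERS[1:5]), 5)
)

# Split only the x and y columns; element names come from the levels of g
Y <- split(df[c("x", "y")], df$g)
names(Y)

# 'groups' is a hypothetical environment that holds one data frame per level
groups <- list2env(Y, envir = new.env())
ls(groups)
groups$A
```

Passing envir = .GlobalEnv instead would create A through E in the workspace, as in the answer.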
