List of data frames: how to group every nth element?

Time: 2021-08-14 21:07:55

Let's say I have a list of dataframes called old_list:

# Create the vectors for the dfs; old_list will be a list of length 10
old_list <- replicate(10, rnorm(10), simplify = FALSE)

# Turn each vector in old_list into a df and add a second column
library(dplyr)

old_list <- lapply(old_list, function(x) as.data.frame(x) %>% mutate(mu = 1))

Ok, now old_list looks something like this:

   [[1]]
             x mu
1  -0.47734743  1
2   0.28986887  1
3   0.02933248  1
4  -2.15761840  1
5   0.32944305  1
6   0.33237442  1
7  -0.48621491  1
8  -0.61504793  1
9  -1.45353709  1
10 -1.22628027  1

[[2]]
             x mu
1   0.10329026  1
2  -0.43502662  1
3   0.87865194  1
4  -0.37628634  1
5   0.06234334  1
6   0.35441583  1
7   0.46176186  1
8   1.98786158  1
9   1.81183387  1
10  2.18143130  1 .... up to the 10th element

I want to group every n dfs in old_list into a new list called new_list. Let's say I want to group every second df. new_list should then be a list of length 5 where every element contains 2 dfs. I've tried code like this:

new_list<-list()
for (i in 1:seq(1,length(old_list),2)){
  new_list[[i]]<-list(old_list[i:i+1])
  }

But this doesn't 'group' the 1st and 2nd, 3rd and 4th, 5th and 6th... dfs from old_list like I'd like. Any tips?

This is what the first element of new_list should look like (I didn't set a seed, so ignore the different values of rnorm(10)):

list(c(old_list[1],old_list[2]))
[[1]]
[[1]][[1]]
             x mu
1   0.56877414  1
2  -2.35897500  1
3   1.16982547  1
4  -0.36609697  1
5   0.53758988  1
6  -1.05709000  1
7  -1.15997033  1
8  -0.07746139  1
9  -0.55179839  1
10 -0.11192844  1

[[1]][[2]]
             x mu
1   0.34540644  1
2  -0.14567340  1
3  -0.56627562  1
4   0.22785077  1
5  -1.73692747  1
6  -1.03707293  1
7  -0.32093204  1
8   0.09449727  1
9   0.41419075  1
10 -0.17093046  1

2 solutions

#1


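If we need to split up the list and nest its elements in groups of 'n', we can use gl to create a grouping variable, split 'old_list' by it, and convert each element to a tibble. (The code below rebuilds each element with x and mu, so it appears to assume old_list still holds the raw numeric vectors.)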

library(tidyverse)
n <- 2
map2(list(old_list), length(old_list),
       ~split(.x, as.integer(gl(.y, n, .y)))) %>% 
      modify_depth(3, ~tibble(x = ., mu = 1))

Or maybe this, in base R:

n <- 2
res <- lapply(split(old_list, as.integer(gl(length(old_list), n, 
          length(old_list)))), function(x) 
          lapply(x, function(y) data.frame(x= y, mu = 1)))

res[1]
#$`1`
#$`1`[[1]]
#             x mu
#1   1.11696564  1
#2  -0.32362765  1
#3   0.07355866  1
#4   0.97178378  1
#5   0.55000016  1
#6   0.34958254  1
#7   1.32894403  1
#8  -1.02388909  1
#9   0.48285111  1
#10 -0.55077723  1

#$`1`[[2]]
#            x mu
#1  -0.4506403  1
#2   0.8701737  1
#3   3.3360928  1
#4   1.4608549  1
#5   1.1038983  1
#6   2.3979434  1
#7   0.1652383  1
#8   0.2294786  1
#9   0.2031739  1
#10 -0.4322401  1
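
If old_list already holds the data frames from the question (rather than raw vectors), the re-wrapping step can probably be skipped. Here is a minimal sketch of the same split idea under that assumption; the ceiling(seq_along(old_list) / n) index and the fixed seed are my own choices for illustration, not part of the code above:

```r
# Rebuild a list like the question's: 10 data frames with columns x and mu
set.seed(1)  # arbitrary seed, only so the example is reproducible
old_list <- replicate(10, data.frame(x = rnorm(10), mu = 1), simplify = FALSE)

n <- 2
# Group index 1, 1, 2, 2, ...: the first n elements go to group 1, the next n to group 2, etc.
grp <- ceiling(seq_along(old_list) / n)

# split() works on lists as well as vectors; unname() drops the "1".."5" names
new_list <- unname(split(old_list, grp))

length(new_list)   # 5
lengths(new_list)  # 2 2 2 2 2
```

Each element of new_list is then itself a list of n data frames, so new_list[[1]][[2]] is the same data frame as old_list[[2]].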

#2


Does this work?

n <- 2
starts <- seq(1,length(old_list),n)
ends   <- unique(c(seq(n,length(old_list),n),length(old_list)))
res <- Map(function(x,y){old_list[x:y]},starts,ends)
length(res)  # 5 for the 10-element old_list
lengths(res) # [1] 2 2 2 2 2 (a leftover element would form a shorter last group)
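
For reference, the loop from the question can also be repaired directly. Two things seem to trip it up: 1:seq(...) is not a sequence of start positions, and i:i+1 parses as (i:i) + 1 because `:` binds tighter than `+`. A hedged sketch of a corrected loop follows; the helper names starts, k and last are my own, and the seed is arbitrary:

```r
# Rebuild a list like the question's: 10 data frames with columns x and mu
set.seed(1)  # arbitrary seed, only so the example is reproducible
old_list <- replicate(10, data.frame(x = rnorm(10), mu = 1), simplify = FALSE)

n <- 2
starts <- seq(1, length(old_list), by = n)  # 1, 3, 5, 7, 9
new_list <- vector("list", length(starts))  # preallocate one slot per group

for (k in seq_along(starts)) {
  i <- starts[k]
  # Clamp the end index so a short final group does not run past the list
  last <- min(i + n - 1, length(old_list))
  # Parentheses matter here: i:i+1 would evaluate to i + 1, not c(i, i + 1)
  new_list[[k]] <- old_list[i:last]
}

length(new_list)  # 5
```

Indexing new_list by k rather than by i keeps the result free of empty slots at the even positions.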
