Is it possible to store multiple data frames in one data structure and later process each data frame in turn? For example:
df1 <- data.frame(c(1,2,3), c(4,5,6))
df2 <- data.frame(c(11,22,33), c(44,55,66))
Then I would like to add them to a data structure so that I can loop through it, retrieving one data frame at a time and processing it, something like:
for ( iterate through the data structure) # this gives df1, then df2
{
write data frame to a file
}
I cannot find any such data structure in R. Can anyone point me to code that illustrates this functionality?
3 solutions
#1
12
Just put the data.frames in a list. A plus is that a list works really well with apply-style loops. For example, if you want to save the data.frames, you can use mapply:
l = list(df1, df2)
mapply(write.table, x = l, file = c("df1.txt", "df2.txt"))
If you like apply-style loops (and you will, trust me :)), please take a look at the epic plyr package. It might not be the fastest package (look at data.table for speed), but it drips with syntactic sugar.
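For completeness, the loop sketched in the question can also be written as a plain for loop over the list. This is only a minimal sketch: the column names `a` and `b`, the list name `dfs`, the output directory (`tempdir()`) and the `.txt` file names are all assumptions made for illustration.

```r
df1 <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
df2 <- data.frame(a = c(11, 22, 33), b = c(44, 55, 66))

# A named list holds the data frames; the names are reused as file names.
dfs <- list(df1 = df1, df2 = df2)

# Assumption: write into the session's temporary directory.
out_dir <- tempdir()

for (nm in names(dfs)) {
  # dfs[[nm]] is the data frame itself (note the double brackets)
  out_file <- file.path(out_dir, paste0(nm, ".txt"))
  write.table(dfs[[nm]], file = out_file, row.names = FALSE)
}
```

Looping over the names rather than the elements keeps a label for each data frame available, which is handy for building file names.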
#2
8
Lists can be used to hold almost anything, including data.frames:
## Versatility of lists
l <- list(file(), new.env(), data.frame(a=1:4))
For writing out multiple data objects stored in a list, lapply() is your friend:
ll <- list(df1=df1, df2=df2)
## Write out as *.csv files
lapply(names(ll), function(X) write.csv(ll[[X]], file=paste0(X, ".csv")))
## Save in *.Rdata files
lapply(names(ll), function(X) {
assign(X, ll[[X]])
save(list=X, file=paste0(X, ".Rdata"))
})
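As a possible alternative to the assign()/save() pair, base R's saveRDS() writes a single object per file without needing a variable name, and readRDS() reads it back. This is a swapped-in technique, not what the code above does. A minimal sketch, assuming small example data frames, `tempdir()` as the target directory, and `.rds` file names:

```r
# Assumption: two small example data frames in a named list
ll <- list(df1 = data.frame(a = 1:3), df2 = data.frame(a = 4:6))
out_dir <- tempdir()  # assumption: temporary directory as the target

# Write one .rds file per list element; no assign() needed
invisible(lapply(names(ll), function(X) {
  saveRDS(ll[[X]], file = file.path(out_dir, paste0(X, ".rds")))
}))

# Read the files back into a list that carries the same names
restored <- lapply(names(ll), function(X) {
  readRDS(file.path(out_dir, paste0(X, ".rds")))
})
names(restored) <- names(ll)
```

Because readRDS() returns the object instead of restoring it under its old name, the result can go straight back into a list.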
#3
5
What you are looking for is a list. You can use a function like lapply to treat each of your data frames separately in the same manner. However, there might be cases where you need to pass your list of data frames to a function that handles the data frames in relation to each other. In this case lapply doesn't help you.
That's why it is important to note how you can access and iterate the data frames in your list. It's done like this:
mylist[[data frame]][row,column]
Note the double brackets around your data frame index. So for your example it would be:
df1 <- data.frame(c(1,2,3), c(4,5,6))
df2 <- data.frame(c(11,22,33), c(44,55,66))
mylist<-list(df1,df2)
mylist[[1]][1,2]
would return 4, whereas mylist[1][1,2] would not: mylist[1] is a one-element list rather than a data frame, so row and column indexing on it fails with an "incorrect number of dimensions" error. It took a while for me to find this, so I thought it might be helpful to post here.
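The difference between the two bracket styles can be checked directly with class(). The following is just a small sketch built on the example data above; the index loop and the variable `first_cell` are added for illustration.

```r
df1 <- data.frame(c(1, 2, 3), c(4, 5, 6))
df2 <- data.frame(c(11, 22, 33), c(44, 55, 66))
mylist <- list(df1, df2)

class(mylist[[1]])  # "data.frame": the element itself
class(mylist[1])    # "list": a one-element list that still wraps df1

# Iterating by index gives access to each data frame in turn
for (i in seq_along(mylist)) {
  first_cell <- mylist[[i]][1, 1]  # row 1, column 1 of the i-th data frame
  print(first_cell)
}
```

Using seq_along() instead of 1:length(mylist) keeps the loop safe when the list is empty.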