Populating a data frame with file.info using an apply function

Time: 2021-03-28 18:37:09

I would like to populate an existing empty dataframe with file information using a list and the file.info function. I've been doing the same task using a for loop, but would like to learn how to use the apply family and thought this would be a nice easy example.

My list...

listOfFiles_M <- c("I:\\temp\\APIS2//APIS01/WAV/APIS01_20170414_150000.wav", "I:\\temp\\APIS2//APIS01/WAV/APIS01_20170414_160000.wav", 
"I:\\temp\\APIS2//APIS01/WAV/APIS01_20170414_170000.wav", "I:\\temp\\APIS2//APIS01/WAV/APIS01_20170414_180000.wav"
)

My empty dataframe...

m_files <- structure(list(size = numeric(0), isdir = logical(0), mode = structure(integer(0), class = "octmode"), 
    mtime = structure(numeric(0), class = c("POSIXct", "POSIXt"
    )), ctime = structure(numeric(0), class = c("POSIXct", "POSIXt"
    )), atime = structure(numeric(0), class = c("POSIXct", "POSIXt"
    )), exe = character(0)), .Names = c("size", "isdir", "mode", 
"mtime", "ctime", "atime", "exe"), row.names = character(0), class = "data.frame")

My function...

test.info <- function(i,x){
  print (i)
  x[i,]=c(file.info(i))
}

And I thought I should use lapply thusly...

lapply(listOfFiles_M, test.info)

And here is an example of what I would like a populated m_files to look like...

m_files <- structure(list(rn = c("I:\\temp\\APIS2//APIS01/WAV/APIS01_20170414_150000.wav", 
"I:\\temp\\APIS2//APIS01/WAV/APIS01_20170414_160000.wav", "I:\\temp\\APIS2//APIS01/WAV/APIS01_20170414_170000.wav", 
"I:\\temp\\APIS2//APIS01/WAV/APIS01_20170414_180000.wav"), size = c(9601276, 
9601276, 9601276, 9601276), isdir = c(FALSE, FALSE, FALSE, FALSE
), mode = structure(c(438L, 438L, 438L, 438L), class = "octmode"), 
    mtime = structure(c(1492200300, 1492203900, 1492207500, 1492211100
    ), class = c("POSIXct", "POSIXt")), ctime = structure(c(1537974713.78911, 
    1537974713.85152, 1537974713.89832, 1537974713.92952), class = c("POSIXct", 
    "POSIXt")), atime = structure(c(1537974713.78911, 1537974713.85152, 
    1537974713.89832, 1537974713.92952), class = c("POSIXct", 
    "POSIXt")), exe = c("no", "no", "no", "no")), .Names = c("rn", 
"size", "isdir", "mode", "mtime", "ctime", "atime", "exe"), row.names = c(NA, 
-4L), class = "data.frame")

EDIT: I should have also mentioned that there is a large list, ~200,000 items, so rbind is probably not a good solution.

2 solutions

#1


1  

Simply pass your vector of file paths to file.info, which accepts more than one path as input and returns a data frame, according to the docs (?file.info).

final_df <- file.info(listOfFiles_M)

There is no need to initialize an empty data frame and map values into it, or to rbind the objects returned on each iteration.
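
As a minimal sketch of the vectorized call, the following uses throwaway files created under tempdir() as a stand-in for the WAV paths in the question (an assumption, since those files are not available here). The names tmp_files and info_df are illustrative only, and the rn step is included just to mirror the layout of the desired m_files.

```r
# Create a few small temporary files to stand in for the real WAV paths.
tmp_files <- file.path(tempdir(), sprintf("demo_%d.txt", 1:3))
for (f in tmp_files) writeLines("hello", f)

# One vectorized call: file.info() returns a data frame with one row per
# path, using the paths as row names.
info_df <- file.info(tmp_files)

# Optional: move the row names into an 'rn' column, as in the desired output.
info_df <- cbind(rn = rownames(info_df), info_df, stringsAsFactors = FALSE)
rownames(info_df) <- NULL

str(info_df)
```

The set of columns depends on the platform (for example, exe appears on Windows only), so the result may not match the empty m_files template column for column.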

#2


0  

I assume the function file.info is designed to take a name of a file and then spit out a vector of length 7 which you use to populate a row.

Just a recommendation: this is a bit hard to test without the output of file.info for at least one file, so I would recommend simplifying your m_files data frame when you post.

I believe the only issue is that you need to specify the x argument in your lapply.

 lapply(listOfFiles_M, test.info, x = m_files)

The ... argument in lapply lets you pass any other arguments that the function you supply (in this case test.info) may need.
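
To illustrate how lapply forwards extra arguments through ..., here is a small sketch with two hypothetical helpers, add_offset and fill_row, neither of which comes from the question. It also shows a caveat that may matter here: R passes arguments by value, so an assignment into x inside the function changes only a local copy, and the data frame passed in stays empty unless the returned values are collected.

```r
# Hypothetical helper: 'offset' is supplied through lapply's '...'.
add_offset <- function(i, offset) i + offset
res <- lapply(1:3, add_offset, offset = 10)
# res is a list holding 11, 12 and 13

# Caveat: modifying an argument inside a function changes a local copy only.
df <- data.frame(value = numeric(0))
fill_row <- function(i, x) {
  x[nrow(x) + 1, "value"] <- i  # appends a row to the local copy of x
  x                             # return the modified copy
}
out <- lapply(1:3, fill_row, x = df)

nrow(df)        # still 0: the original data frame is untouched
nrow(out[[1]])  # 1: each call returned its own one-row copy
```

Under that assumption, the results would still need to be combined from the list that lapply returns, rather than read back from m_files.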
