R - Read in and extract the same cell from a list of binary img files

Time: 2021-05-15 21:26:53

I’m trying to extract the values of the same [i,j] cell from ~14000 img files. I’ve set up a working function that did this for smaller batches where it was reasonable to put the files in my directory, but now that I’m ready to look at the larger dataset I’m stuck. The img files are organized by year, with 365 separate files for each of 38 winters. Each winter has its own folder (WS1978_1979data, WS1979_1980data, etc.), and each day has its own file containing snow depth data for a large satellite grid in the Arctic (ssmi_n_snowdepth_5day_1978307.img, ssmi_n_snowdepth_5day_1978308.img, etc.) starting October 1 and going through September 30 of the following year. My ultimate hope (at least for this stage) is to create a vector of 365 snow depths for the cell of interest and to do this for each year in the dataset.

I can specify the appropriate file path to generate a list of the files I want for a given year, but then when I use my function to extract the particular cell I want, it looks for the file in the directory, which is wrong. Can you help me out? I feel like I must be missing something simple but I haven’t been able to find what I need.

Example of making a list of all the files for the winter of 1979-1980:

w1979s1980 <- as.vector(list.files(path="SnowDepth/WS1979_1980data", pattern=".img"))

Function to extract the snow depth from a given cell for all the files in that list:

cell.depthKotz <- function(depthfile) {
  # One snow depth per input file
  depth.val <- rep(NA_integer_, length(depthfile))
  for (i in seq_along(depthfile)) {
    # Each file is read as a 448 x 304 grid of 2-byte little-endian integers
    depth.mat <- matrix(readBin(depthfile[i], what = "integer", n = 136192, size = 2, endian = "little"),
                        nrow = 448, ncol = 304, byrow = TRUE)
    depth.val[i] <- depth.mat[187, 65]
  }
  # Codes 110 through 160 are recoded to NA
  depth.val[depth.val %in% seq(110, 160, by = 10)] <- NA
  return(depth.val)
}

And then probably save this as a vector when I run the function for a given year:

Sdepths1979.1980 <- as.vector(cell.depthKotz(w1979s1980))

I should add that I’m very new to all this as far as even knowing how to phrase what I’m asking for, so let me know if I need to edit the title/question or add more detail. I’m not concerned about runtime if you see that sort of inefficiency in the functions above, but if there are obvious changes that would mean less repetitive/manual effort from me and more automated effort from R feel free to say so. Thanks for your help!

1 solution

#1


There is a recursive flag in the list.files function.

files <- list.files(path = "src", pattern = "\\.jpg$", recursive = TRUE)

If you point the path at the parent directory and add the recursive = TRUE flag, you should be good.
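
One detail that may matter for the question's setup: by default list.files returns names relative to path, so readBin would look for them in the working directory. Passing full.names = TRUE keeps the folder prefix. A minimal sketch, assuming a throwaway directory tree built under tempdir() that mimics the question's folders (the dummy files and the by_winter name are illustrative, not from the original post):

```r
# Build a small stand-in for the SnowDepth/ tree in a temporary directory
# (hypothetical layout: two winters with two empty files each).
root <- file.path(tempdir(), "SnowDepth")
winters <- c("WS1978_1979data", "WS1979_1980data")
for (w in winters) {
  dir.create(file.path(root, w), recursive = TRUE, showWarnings = FALSE)
  file.create(file.path(root, w, paste0("ssmi_n_snowdepth_5day_", 1:2, ".img")))
}

# recursive = TRUE descends into every winter folder;
# full.names = TRUE keeps the path so each file can be opened later.
files <- list.files(path = root, pattern = "\\.img$",
                    recursive = TRUE, full.names = TRUE)

# Group the paths by the folder they sit in: one character vector per winter.
by_winter <- split(files, basename(dirname(files)))

print(names(by_winter))
print(lengths(by_winter))
```

With real data, each element of by_winter could then be handed to the question's function, for example lapply(by_winter, cell.depthKotz), which would give one vector of depths per winter.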

Optionally, you can end the pattern with $, which requires file names to end with the pattern. That way, in the rare case where the directory contains another file named someinfo.img.txt, that file is ignored.
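
To make the difference concrete, here is a small sketch using grepl, which takes the same kind of regular expression that list.files accepts as pattern. The three file names are made up for illustration:

```r
# Hypothetical file names, chosen to show what each pattern lets through.
candidates <- c("ssmi_n_snowdepth_5day_1978307.img",
                "someinfo.img.txt",
                "timglog.csv")

# ".img" is a regular expression: any character followed by "img", anywhere.
loose <- grepl(".img", candidates)

# "\\.img$" requires a literal dot and "img" at the very end of the name.
strict <- grepl("\\.img$", candidates)

print(data.frame(file = candidates, loose = loose, strict = strict))
```

The loose pattern accepts all three names, while the anchored one keeps only the real .img file.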
