Scala - How do I iterate through only the files in a directory that match a specific string?

Time: 2022-02-17 19:24:24

I have a directory with files that look like part-00000, part-00001, etc. There are also other files that I do not want to iterate through, so I would want to do some form of pattern matching/regex/filtering on file names that start with "part-".

How do I iterate through only the files that start with "part-"?

4 solutions

#1


  1. You can use the regex part-.* for example.

  2. If the rest of the name contains only digits, you can use part-\d*

  3. If you want to match only part- followed by exactly 5 digits, use part-\d{5}
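
As a rough sketch of how the three patterns above behave in Scala, the snippet below applies each one with String.matches, which requires the whole name to match. The object name, the helper matching, and the sample file names are made up for illustration.

```scala
object PartRegexDemo {
  // Hypothetical file names, used only for illustration.
  val names = List("part-00000", "part-00001", "part-123", "part-abc", "randomFile")

  // String.matches succeeds only if the entire name matches the pattern.
  def matching(pattern: String): List[String] =
    names.filter(_.matches(pattern))

  def main(args: Array[String]): Unit = {
    println(matching("part-.*"))     // any name starting with part-
    println(matching("part-\\d*"))   // part- followed only by digits
    println(matching("part-\\d{5}")) // part- followed by exactly five digits
  }
}
```

Note that the backslash has to be doubled inside an ordinary Scala string literal; a triple-quoted string such as """part-\d{5}""" avoids that.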

#2


Provided that you already have the list of files:

object Test {
   def main(args: Array[String]): Unit = {
       val listOfFiles = List("part-00000", "part-00001", "randomFile", "part-00003", "randomFile2", "part-00004")
       val prefix = "part-"

       // Keep only the names that start with the prefix, then print each one
       listOfFiles.filter(_.startsWith(prefix)).foreach(println)
   }
}

We take the list, first apply a filter, and then visit each remaining element with foreach. You can put whatever logic you want inside the foreach; use map instead if you need to transform the elements into a new list.
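
To illustrate putting real logic into the per-element step, here is a minimal sketch that filters by prefix and then maps each name to its numeric suffix. The helper partNumbers is hypothetical, and it assumes every name after the prefix consists only of digits (toInt throws otherwise).

```scala
object PartNumbers {
  // Hypothetical helper: turns "part-00003" into 3.
  // Assumes the text after the prefix is purely numeric.
  def partNumbers(names: List[String], prefix: String = "part-"): List[Int] =
    names
      .filter(_.startsWith(prefix))     // keep only part- files
      .map(_.stripPrefix(prefix).toInt) // drop the prefix, parse the rest

  def main(args: Array[String]): Unit = {
    val listOfFiles = List("part-00000", "part-00001", "randomFile", "part-00003", "randomFile2", "part-00004")
    println(partNumbers(listOfFiles)) // List(0, 1, 3, 4)
  }
}
```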

#3


You could use filter:

import java.io.File
new File("c:/sequence-files/").listFiles.filter(_.getName.startsWith("part-")).foreach(println)
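
One caveat with this one-liner: java.io.File.listFiles returns null when the path is not a directory or cannot be read, so the filter call can throw a NullPointerException. A possible null-safe variant is sketched below; the object name, the helper partFiles, and the temporary directory used in main are illustrative assumptions, not part of the original answer.

```scala
import java.io.File
import java.nio.file.Files

object PartFiles {
  // listFiles returns null for a missing or unreadable directory,
  // so wrap the result in Option before filtering.
  def partFiles(dir: File, prefix: String = "part-"): List[File] =
    Option(dir.listFiles)
      .map(_.toList)
      .getOrElse(Nil)
      .filter(f => f.isFile && f.getName.startsWith(prefix))
      .sortBy(_.getName) // listFiles gives no ordering guarantee

  def main(args: Array[String]): Unit = {
    // Build a throwaway directory so the sketch runs anywhere.
    val dir = Files.createTempDirectory("sequence-files").toFile
    List("part-00000", "part-00001", "_SUCCESS")
      .foreach(name => new File(dir, name).createNewFile())

    partFiles(dir).foreach(f => println(f.getName))
  }
}
```

Sorting by name is optional; it is included because the order returned by listFiles is unspecified.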

#4


You can define a function like this:

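import java.io.File

def listFiles(dir: File, pattern: String): Array[File] = {
  val regex = pattern.r
  // listFiles returns null if dir is not a readable directory
  Option(dir.listFiles()).getOrElse(Array.empty[File])
    // test each entry's own name (f), not the directory's name
    .filter(f => f.isFile && regex.findFirstIn(f.getName).isDefined)
}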

And call it with the directory and the pattern. Because the pattern is compiled as a regular expression (not a shell glob) and findFirstIn matches anywhere in the name, the pattern for all files starting with part- would be ^part-. Below is an example call:

val files = listFiles(new File("path"), "^part-")
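
The choice of pattern matters here because .r compiles a regular expression rather than a glob. As a regex, part-* means "part" followed by zero or more hyphens, and findFirstIn searches anywhere in the string, so it also accepts names that merely contain "part". The sketch below compares it with the anchored ^part-; the object name, the helper matching, and the sample names are made up for illustration.

```scala
object RegexVsGlob {
  // Hypothetical file names, used only for illustration.
  val names = List("part-00000", "my-part-file", "partial.txt", "randomFile")

  // Same test as in the function above: an unanchored regex search.
  def matching(pattern: String): List[String] = {
    val regex = pattern.r
    names.filter(name => regex.findFirstIn(name).isDefined)
  }

  def main(args: Array[String]): Unit = {
    println(matching("part-*")) // List(part-00000, my-part-file, partial.txt)
    println(matching("^part-")) // List(part-00000)
  }
}
```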
