Searching a folder and all of its subfolders for files of a given type

Date: 2023-01-14 09:17:14

I am trying to find all files of a given type (say .pdf) in a given folder and copy them to a new folder. I need to be able to specify a root folder and search that folder and all of its subfolders, at any depth, for files that match the given type (.pdf). Can anyone give me a hand with how to walk the root folder's subfolders, their subfolders, and so on? It sounds like a recursive method would do the trick, but I have not managed to implement one correctly. (I am writing this program in Ruby, by the way.)


4 solutions

#1


55  

You want the Find module. Find.find takes a string containing a path and yields that starting path, followed by the path of every file and sub-directory beneath it, to the accompanying block. Some example code:


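require 'find'

pdf_file_paths = []
# Find.find walks 'path/to/search' recursively and yields each path it visits.
Find.find('path/to/search') do |path|
  # \z anchors the match at the very end of the path string.
  pdf_file_paths << path if path =~ /\.pdf\z/
end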

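That will recursively search a path and store every path ending in .pdf in an array. Note that the check looks only at the name, so a directory whose name ends in .pdf would be collected as well.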
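If only regular files are wanted and the extension should match regardless of case, the same walk could be sketched as below. The method name `find_files_by_ext` and its parameters are hypothetical, introduced only for illustration; it relies on the standard `Find.find`, `File.file?`, and `File.extname` calls.

```ruby
require 'find'

# Hypothetical helper (not from the original answer): collect regular files
# under `root` whose extension equals `ext`, ignoring case.
def find_files_by_ext(root, ext = '.pdf')
  matches = []
  Find.find(root) do |path|
    # Skip directories, even ones whose name happens to end in ".pdf".
    next unless File.file?(path)
    matches << path if File.extname(path).downcase == ext.downcase
  end
  matches
end

# Usage (the path is a placeholder):
#   pdf_file_paths = find_files_by_ext('path/to/search')
```

Comparing `File.extname` instead of using a regular expression avoids anchoring mistakes, at the cost of handling only a single extension per call.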


#2


89  

Try this:


Dir.glob("#{folder}/**/*.pdf")

which is the same as


Dir["#{folder}/**/*.pdf"]

Here, folder is a variable holding the path to the root folder you want to search through. The ** component matches directories recursively, so files at any depth are found.
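The question also asks for the matches to be copied into a new folder, which none of the answers show. A minimal sketch of that step, assuming the destination is a single flat folder and that a later file with the same name as an earlier one should be skipped (both are assumptions, as are the method name `copy_files_by_ext` and its parameters):

```ruby
require 'fileutils'

# Hypothetical helper (not from the original answers): copy every file under
# `root` with extension `ext` into the flat folder `dest`.
# Returns the list of destination paths that were written.
def copy_files_by_ext(root, dest, ext = 'pdf')
  FileUtils.mkdir_p(dest)
  sources = Dir.glob(File.join(root, '**', "*.#{ext}")).sort
  sources.each_with_object([]) do |src, copied|
    next unless File.file?(src) # ignore directories named like "x.pdf"
    target = File.join(dest, File.basename(src))
    # Assumption: on a name clash, keep the first copy and skip later ones.
    next if File.exist?(target)
    FileUtils.cp(src, target)
    copied << target
  end
end

# Usage (both paths are placeholders):
#   copy_files_by_ext('path/to/search', 'path/to/new_folder')
```

Flattening into one folder loses the original directory layout; if that layout matters, the target path would need to be built from each file's path relative to the root instead of from its basename.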


#3


18  

If speed is a concern, prefer Dir.glob over Find.find.


Warming up --------------------------------------
           Find.find   124.000  i/100ms
            Dir.glob   515.000  i/100ms
Calculating -------------------------------------
           Find.find      1.242k (± 4.7%) i/s -      6.200k in   5.001398s
            Dir.glob      5.249k (± 4.5%) i/s -     26.265k in   5.014632s

Comparison:
            Dir.glob:     5248.5 i/s
           Find.find:     1242.4 i/s - 4.22x slower

require 'find'
require 'benchmark/ips'

dir = '.'

Benchmark.ips do |x|
  x.report 'Find.find' do
    # Match paths ending in ".pdf" (the original pattern /\*\.pdf/ only matched a literal "*.pdf").
    Find.find(dir).select { |f| f =~ /\.pdf\z/ }
  end

  x.report 'Dir.glob' do
    Dir.glob("#{dir}/**/*.pdf")
  end

  x.compare!
end

Using ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin15]


#4


9  

As a small improvement to Jergason and Matt's answer above, here's how you can condense to a single line:


pdf_file_paths = Find.find('path/to/search').select { |p| /.*\.pdf$/ =~ p }

This uses Find.find as above, but relies on the fact that, when called without a block, it returns an Enumerator. Because an Enumerator is enumerable, we can call select on it and get back an array of the matching paths.
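Because the block-less call returns an Enumerator, other Enumerable methods work too. A short sketch, assuming a case-insensitive match is acceptable; the method names `pdf_paths` and `first_pdf_paths` are hypothetical:

```ruby
require 'find'

# Hypothetical helper: grep is shorthand for select with a pattern match.
def pdf_paths(root)
  Find.find(root).grep(/\.pdf\z/i)
end

# Hypothetical helper: with lazy, the directory walk stops as soon as
# `limit` matches have been found instead of visiting the whole tree.
def first_pdf_paths(root, limit)
  Find.find(root).lazy.grep(/\.pdf\z/i).first(limit)
end

# Usage (the path is a placeholder):
#   pdf_paths('path/to/search')
#   first_pdf_paths('path/to/search', 10)
```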

