This is what I have:
glob(os.path.join('src','*.c'))
but I want to search the subfolders of src. Something like this would work:
glob(os.path.join('src','*.c'))
glob(os.path.join('src','*','*.c'))
glob(os.path.join('src','*','*','*.c'))
glob(os.path.join('src','*','*','*','*.c'))
But this is obviously limited and clunky.
21 solutions
#1
907
Python 3.5+
Starting with Python 3.5, the glob module supports the "**" directive (which is parsed only if you pass the recursive=True flag):
import glob
for filename in glob.iglob('src/**/*.c', recursive=True):
    print(filename)
If you need a list, just use glob.glob instead of glob.iglob.
To match files beginning with a dot (.), such as files in the current directory or hidden files on Unix-based systems, use the os.walk solution below; glob's wildcards do not match a leading dot.
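A minimal sketch of that difference, assuming a throwaway tree built with tempfile (the file names and the helper name walk_match are made up for illustration): glob's wildcards skip names that start with a dot, while fnmatch applied to os.walk results does not.

```python
import fnmatch
import glob
import os
import tempfile

def walk_match(top, pattern):
    """Return paths under top whose base name matches pattern, dotfiles included."""
    matches = []
    for root, dirnames, filenames in os.walk(top):
        for filename in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(root, filename))
    return sorted(matches)

# Build a tiny tree: src/a.c and src/sub/.hidden.c (hypothetical names).
top = tempfile.mkdtemp()
src = os.path.join(top, 'src')
os.makedirs(os.path.join(src, 'sub'))
for relpath in ('a.c', os.path.join('sub', '.hidden.c')):
    open(os.path.join(src, relpath), 'w').close()

with_glob = sorted(glob.glob(os.path.join(src, '**', '*.c'), recursive=True))
with_walk = walk_match(src, '*.c')

print(with_glob)  # only a.c: '*' does not match a leading dot
print(with_walk)  # both a.c and sub/.hidden.c
```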
Python 2.2 to 3.4
For older Python versions, starting with Python 2.2, use os.walk to recursively walk a directory and fnmatch.filter to match against a simple expression:
import fnmatch
import os
matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))
Python 2.1 and earlier
For even older Python versions, use glob.glob against each filename instead of fnmatch.filter.
#2
92
Similar to other solutions, but using fnmatch.fnmatch instead of glob, since os.walk already listed the filenames:
import os, fnmatch
def find_files(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            if fnmatch.fnmatch(basename, pattern):
                filename = os.path.join(root, basename)
                yield filename
for filename in find_files('src', '*.c'):
    print('Found C source:', filename)
Also, using a generator allows you to process each file as it is found, instead of finding all the files and then processing them.
#3
48
I've modified the glob module to support ** for recursive globbing, e.g.:
>>> import glob2
>>> all_header_files = glob2.glob('src/**/*.c')
https://github.com/miracle2k/python-glob2/
This is useful when you want to let your users supply patterns that use the ** syntax, in which case os.walk() alone is not good enough.
#4
38
Starting with Python 3.4, one can use the glob() method of one of the Path classes in the new pathlib module, which supports ** wildcards. For example:
from pathlib import Path
for file_path in Path('src').glob('**/*.c'):
    print(file_path) # do whatever you need with these files
Update: Starting with Python 3.5, the same syntax is also supported by glob.glob().
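Path.rglob() is a shorthand for the same search: it behaves like calling glob() with '**/' prepended to the pattern. A small sketch, assuming a temporary directory with made-up file names:

```python
import tempfile
from pathlib import Path

# Build a throwaway tree (hypothetical names) so the example is self-contained.
top = Path(tempfile.mkdtemp())
(top / 'src' / 'lib').mkdir(parents=True)
(top / 'src' / 'main.c').touch()
(top / 'src' / 'lib' / 'util.c').touch()
(top / 'src' / 'lib' / 'util.h').touch()

src = top / 'src'
via_glob = sorted(src.glob('**/*.c'))   # explicit recursive pattern
via_rglob = sorted(src.rglob('*.c'))    # same search, shorter spelling

for file_path in via_rglob:
    print(file_path.relative_to(src))
```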
#5
34
import os
import fnmatch
def recursive_glob(treeroot, pattern):
    results = []
    for base, dirs, files in os.walk(treeroot):
        goodfiles = fnmatch.filter(files, pattern)
        results.extend(os.path.join(base, f) for f in goodfiles)
    return results
fnmatch gives you exactly the same patterns as glob, so this is really an excellent replacement for glob.glob with very close semantics. An iterative version (e.g. a generator), in other words a replacement for glob.iglob, is a trivial adaptation: just yield the intermediate results as you go, instead of extending a single results list to return at the end.
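The generator adaptation described above might look like the sketch below. The name recursive_iglob and the temporary test tree are made up for illustration; the walking and matching logic is the same as in recursive_glob.

```python
import fnmatch
import os
import tempfile

def recursive_iglob(treeroot, pattern):
    """Lazily yield paths under treeroot whose base name matches pattern."""
    for base, dirs, files in os.walk(treeroot):
        for f in fnmatch.filter(files, pattern):
            yield os.path.join(base, f)

# Throwaway tree with made-up names, just to exercise the generator.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'a', 'b'))
for relpath in ('x.c', os.path.join('a', 'y.c'), os.path.join('a', 'b', 'z.txt')):
    open(os.path.join(top, relpath), 'w').close()

found = sorted(recursive_iglob(top, '*.c'))
print(found)
```

Because nothing is collected up front, a caller can stop iterating early without walking the rest of the tree.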
#6
17
You'll want to use os.walk to collect filenames that match your criteria. For example:
import os
cfiles = []
for root, dirs, files in os.walk('src'):
    for file in files:
        if file.endswith('.c'):
            cfiles.append(os.path.join(root, file))
#7
11
Here's a solution with nested list comprehensions, os.walk and simple suffix matching instead of glob:
import os
cfiles = [os.path.join(root, filename)
          for root, dirnames, filenames in os.walk('src')
          for filename in filenames if filename.endswith('.c')]
It can be compressed to a one-liner:
import os;cfiles=[os.path.join(r,f) for r,d,fs in os.walk('src') for f in fs if f.endswith('.c')]
or generalized as a function:
import os
def recursive_glob(rootdir='.', suffix=''):
    return [os.path.join(looproot, filename)
            for looproot, _, filenames in os.walk(rootdir)
            for filename in filenames if filename.endswith(suffix)]
cfiles = recursive_glob('src', '.c')
If you do need full glob style patterns, you can follow Alex's and Bruno's example and use fnmatch:
import fnmatch
import os
def recursive_glob(rootdir='.', pattern='*'):
    return [os.path.join(looproot, filename)
            for looproot, _, filenames in os.walk(rootdir)
            for filename in filenames
            if fnmatch.fnmatch(filename, pattern)]
cfiles = recursive_glob('src', '*.c')
#8
5
Johan and Bruno provide excellent solutions for the minimal requirement as stated. I have just released Formic, which implements Ant FileSet and Globs and can handle this and more complicated scenarios. An implementation of your requirement is:
import formic
fileset = formic.FileSet(include="/src/**/*.c")
for file_name in fileset.qualified_files():
    print(file_name)
#9
5
Based on other answers, this is my current working implementation, which retrieves nested xml files in a root directory:
import glob
import os
files = []
for root, dirnames, filenames in os.walk(myDir):  # myDir: the root directory to search
    files.extend(glob.glob(root + "/*.xml"))
I'm really having fun with python :)
#10
5
Recently I had to recover my pictures with the extension .jpg. I ran photorec and recovered 4579 directories containing 2.2 million files with a tremendous variety of extensions. With the script below I was able to select the 50133 files with a .jpg extension within minutes:
#!/usr/bin/env python2.7
import glob
import shutil
import os
src_dir = "/home/mustafa/Masaüstü/yedek"
dst_dir = "/home/mustafa/Genel/media"
for mediafile in glob.iglob(os.path.join(src_dir, "*", "*.jpg")): #"*" is for subdirectory
    shutil.copy(mediafile, dst_dir)
#11
3
Another way to do it using just the glob module. Just seed the rglob method with a starting base directory and a pattern to match and it will return a list of matching file names.
import glob
import os
def _getDirs(base):
    return [x for x in glob.iglob(os.path.join(base, '*')) if os.path.isdir(x)]
def rglob(base, pattern):
    results = []
    results.extend(glob.glob(os.path.join(base, pattern)))
    # _getDirs returns paths that already include base, so recurse on them directly
    for d in _getDirs(base):
        results.extend(rglob(d, pattern))
    return results
#12
2
I just made this. It prints files and directories in a hierarchical way, without using fnmatch or os.walk.
#!/usr/bin/python
import os,glob,sys
def dirlist(path, c = 1):
    for i in glob.glob(os.path.join(path, "*")):
        if os.path.isfile(i):
            filepath, filename = os.path.split(i)
            print('----' * c + filename)
        elif os.path.isdir(i):
            dirname = os.path.basename(i)
            print('----' * c + dirname)
            c+=1
            dirlist(i,c)
            c-=1
path = os.path.normpath(sys.argv[1])
print(os.path.basename(path))
dirlist(path)
#13
2
In addition to the suggested answers, you can do this with some lazy generation and list comprehension magic:
import os, glob, itertools
results = itertools.chain.from_iterable(glob.iglob(os.path.join(root,'*.c'))
                                        for root, dirs, files in os.walk('src'))
for f in results: print(f)
Besides fitting in one line and avoiding unnecessary lists in memory, this also has the nice side effect that you can use it in a way similar to the ** operator. For example, you could use os.path.join(root, 'some/path/*.c') in order to get all .c files in all subdirectories of src that have this structure.
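To make the 'some/path/*.c' variant concrete, here is a small sketch. The helper name iglob_under and the temporary tree are made up for illustration; the chaining is the same as above, only the pattern joined onto each root changes.

```python
import glob
import itertools
import os
import tempfile

def iglob_under(top, subpattern):
    """Lazily yield matches of subpattern relative to every directory under top."""
    return itertools.chain.from_iterable(
        glob.iglob(os.path.join(root, subpattern))
        for root, dirs, files in os.walk(top))

# Throwaway tree (hypothetical names): two modules, each with some/path/*.c
top = tempfile.mkdtemp()
for module in ('mod1', 'mod2'):
    d = os.path.join(top, 'src', module, 'some', 'path')
    os.makedirs(d)
    open(os.path.join(d, module + '.c'), 'w').close()
open(os.path.join(top, 'src', 'stray.c'), 'w').close()

pattern = os.path.join('some', 'path', '*.c')
hits = sorted(iglob_under(os.path.join(top, 'src'), pattern))
print(hits)  # the two module files; src/stray.c does not sit under some/path
```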
#14
1
Simplified version of Johan Dahlin's answer, without fnmatch.
import os
matches = []
for root, dirnames, filenames in os.walk('src'):
    matches += [os.path.join(root, f) for f in filenames if f[-2:] == '.c']
#15
1
Or with a list comprehension:
>>> import os
>>> root = r"c:\User\xtofl"
>>> binfiles = [os.path.join(base, f)
...             for base, _, files in os.walk(root)
...             for f in files if f.endswith(".jpg")]
#16
1
This one accepts either an fnmatch pattern or a compiled regular expression:
import fnmatch, os
def filepaths(directory, pattern):
    for root, dirs, files in os.walk(directory):
        for basename in files:
            try:
                matched = pattern.match(basename)
            except AttributeError:
                matched = fnmatch.fnmatch(basename, pattern)
            if matched:
                yield os.path.join(root, basename)
# usage
if __name__ == '__main__':
    from pprint import pprint as pp
    import re
    path = r'/Users/hipertracker/app/myapp'
    pp([x for x in filepaths(path, re.compile(r'.*\.py$'))])
    pp([x for x in filepaths(path, '*.py')])
#17
1
Here is my solution using list comprehension to search for multiple file extensions recursively in a directory and all subdirectories:
import os, glob
def _globrec(path, *exts):
    """ Glob recursively a directory and all subdirectories for multiple file extensions
    Note: On Windows, glob is case-insensitive, i. e. for '*.jpg' you will get files ending
    with .jpg and .JPG
    Parameters
    ----------
    path : str
        A directory name
    exts : tuple
        File extensions to glob for
    Returns
    -------
    files : list
        list of files matching extensions in exts in path and subfolders
    """
    dirs = [a[0] for a in os.walk(path)]
    # os.path.join supplies the separator, so the patterns need no leading backslash
    f_filter = [os.path.join(d, e) for d in dirs for e in exts]
    return [f for pattern in f_filter for f in glob.iglob(pattern)]
my_pictures = _globrec(r'C:\Temp', '*.jpg', '*.bmp', '*.png', '*.gif')
for f in my_pictures:
    print(f)
#18
0
import sys, os, glob
dir_list = ["c:\\books\\heap"]
while len(dir_list) > 0:
    cur_dir = dir_list[0]
    del dir_list[0]
    list_of_files = glob.glob(cur_dir+'\\*')
    for book in list_of_files:
        if os.path.isfile(book):
            print(book)
        else:
            dir_list.append(book)
#19
0
I modified the top answer in this posting and recently created this script, which loops through all files in a given directory (searchdir) and the subdirectories under it, and prints each file's name, root directory, modification and creation dates, and size.
Hope this helps someone walk a directory and get file info.
import time
import fnmatch
import os
def fileinfo(file):
    filename = os.path.basename(file)
    rootdir = os.path.dirname(file)
    lastmod = time.ctime(os.path.getmtime(file))
    creation = time.ctime(os.path.getctime(file))
    filesize = os.path.getsize(file)
    print("%s**\t%s\t%s\t%s\t%s" % (rootdir, filename, lastmod, creation, filesize))
searchdir = r'D:\Your\Directory\Root'
matches = []
for root, dirnames, filenames in os.walk(searchdir):
    ## for filename in fnmatch.filter(filenames, '*.c'):
    for filename in filenames:
        ## matches.append(os.path.join(root, filename))
        ## print matches
        fileinfo(os.path.join(root, filename))
#20
0
Here is a solution that will match the pattern against the full path and not just the base filename.
It uses fnmatch.translate to convert a glob-style pattern into a regular expression, which is then matched against the full path of each file found while walking the directory.
re.IGNORECASE is optional, but desirable on Windows since the file system itself is not case-sensitive. (I didn't bother compiling the regex because docs indicate it should be cached internally.)
import fnmatch
import os
import re
def findfiles(dir, pattern):
    patternregex = fnmatch.translate(pattern)
    for root, dirs, files in os.walk(dir):
        for basename in files:
            filename = os.path.join(root, basename)
            if re.search(patternregex, filename, re.IGNORECASE):
                yield filename
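To show what matching against the full path buys you, here is a hedged sketch of the same idea with the regex compiled once. The name findfiles_fullpath and the temporary tree are made up for illustration. Since '*' in fnmatch also matches path separators, the pattern can constrain directory names as well as file names.

```python
import fnmatch
import os
import re
import tempfile

def findfiles_fullpath(top, pattern):
    """Yield files under top whose full path matches the glob-style pattern."""
    regex = re.compile(fnmatch.translate(pattern), re.IGNORECASE)
    for root, dirs, files in os.walk(top):
        for basename in files:
            filename = os.path.join(root, basename)
            if regex.match(filename):
                yield filename

# Throwaway tree with made-up names: tests/Main.C and core/Main.C
top = tempfile.mkdtemp()
for sub in ('tests', 'core'):
    os.makedirs(os.path.join(top, sub))
    open(os.path.join(top, sub, 'Main.C'), 'w').close()

# Only files directly inside a directory named "tests", ignoring case.
pattern = os.path.join('*', 'tests', '*.c')
hits = list(findfiles_fullpath(top, pattern))
print(hits)
```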
#21
0
I needed a solution for python 2.x that works fast on large directories. I ended up with this:
import subprocess
foundfiles= subprocess.check_output("ls src/*.c src/**/*.c", shell=True)
for foundfile in foundfiles.splitlines():
    print(foundfile)
Note that you might need some exception handling in case ls doesn't find any matching file.
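One way that exception handling could look, as a sketch only: it assumes Python 3 and a POSIX shell with ls available, and the helper name ls_matches is made up for illustration. check_output raises subprocess.CalledProcessError when ls exits with a non-zero status, which is what happens when a pattern matches nothing.

```python
import os
import subprocess
import tempfile

def ls_matches(shell_pattern, cwd):
    """Return the paths printed by `ls` for shell_pattern, or [] if ls fails."""
    try:
        output = subprocess.check_output(
            "ls " + shell_pattern, shell=True, cwd=cwd,
            stderr=subprocess.DEVNULL)
    except subprocess.CalledProcessError:
        # ls exits with a non-zero status when a pattern matches nothing.
        return []
    return output.decode().splitlines()

# Throwaway directory (hypothetical names) to exercise both outcomes.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'src'))
before = ls_matches("src/*.c", cwd=top)   # no .c files yet
open(os.path.join(top, 'src', 'a.c'), 'w').close()
after = ls_matches("src/*.c", cwd=top)
print(before, after)
```

The pattern is pasted into a shell command, so this is only suitable for patterns you control yourself.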