== The os.path Module ==

The ``os.path`` module contains functions for working with long file names (path names). Import the ``os`` module first, and you can then access this module as ``os.path``.

=== Working with File Names ===

The ``os.path`` module provides many functions that handle long file names in a platform-independent way. In other words, you do not have to deal with forward slashes, backslashes, colons, and similar separators yourself. See the sample code in [Example 1-42 #eg-1-42].

====Example 1-42. Using the os.path Module to Handle File Names====[eg-1-42]

```
File: os-path-example-1.py

import os

filename = "my/little/pony"

print "using", os.name, "..."
print "split", "=>", os.path.split(filename)
print "splitext", "=>", os.path.splitext(filename)
print "dirname", "=>", os.path.dirname(filename)
print "basename", "=>", os.path.basename(filename)
print "join", "=>", os.path.join(os.path.dirname(filename),
                                 os.path.basename(filename))

*B*using nt ...
split => ('my/little', 'pony')
splitext => ('my/little/pony', '')
dirname => my/little
basename => pony
join => my/little\pony*b*
```

Note that ``split`` only splits off the last component of the path (without the slash).

The ``os.path`` module also contains a number of functions that let you quickly check what a file name refers to, as shown in [Example 1-43 #eg-1-43].

====Example 1-43. Using the os.path Module to Check What a File Name Represents====[eg-1-43]

```
File: os-path-example-2.py

import os

FILES = (
    os.curdir,
    "/",
    "file",
    "/file",
    "samples",
    "samples/sample.jpg",
    "directory/file",
    "../directory/file",
    "/directory/file"
    )

for file in FILES:
    print file, "=>",
    if os.path.exists(file):
        print "EXISTS",
    if os.path.isabs(file):
        print "ISABS",
    if os.path.isdir(file):
        print "ISDIR",
    if os.path.isfile(file):
        print "ISFILE",
    if os.path.islink(file):
        print "ISLINK",
    if os.path.ismount(file):
        print "ISMOUNT",
    print

*B*. => EXISTS ISDIR
/ => EXISTS ISABS ISDIR ISMOUNT
file =>
/file => ISABS
samples => EXISTS ISDIR
samples/sample.jpg => EXISTS ISFILE
directory/file =>
../directory/file =>
/directory/file => ISABS*b*
```

The ``expanduser`` function treats the user name shortcut (~) the same way most Unix shells do, although it does not work well on Windows. See [Example 1-44 #eg-1-44].

====Example 1-44. Using the os.path Module to Insert the User Name into a File Name====[eg-1-44]

```
File: os-path-expanduser-example-1.py

import os

print os.path.expanduser("~/.pythonrc")

# /home/effbot/.pythonrc
```

The ``expandvars`` function replaces environment variables in a file name with their values, as shown in [Example 1-45 #eg-1-45].

====Example 1-45. Using os.path to Replace Environment Variables in a File Name====[eg-1-45]

```
File: os-path-expandvars-example-1.py

import os

os.environ["USER"] = "user"

print os.path.expandvars("/home/$USER/config")
print os.path.expandvars("$USER/folders")

*B*/home/user/config
user/folders*b*
```

=== Searching the File System ===

The ``walk`` function finds all files in a directory tree (see [Example 1-46 #eg-1-46]). Its arguments are, in order, the directory name, a callback function, and a data object that is passed on to the callback.

====Example 1-46. Using os.path to Search the File System====[eg-1-46]

```
File: os-path-walk-example-1.py

import os

def callback(arg, directory, files):
    # called once per directory, with the names of its entries
    for file in files:
        print os.path.join(directory, file), repr(arg)

os.path.walk(".", callback, "secret message")

*B*./aifc-example-1.py 'secret message'
./anydbm-example-1.py 'secret message'
./array-example-1.py 'secret message'
...
./samples 'secret message'
./samples/sample.jpg 'secret message'
./samples/sample.txt 'secret message'
./samples/sample.zip 'secret message'
./samples/articles 'secret message'
./samples/articles/article-1.txt 'secret message'
./samples/articles/article-2.txt 'secret message'
...*b*
```

The interface of the ``walk`` function is somewhat obscure (maybe it is just me, but I can never remember the order of the arguments). The ``index`` function in [Example 1-47 #eg-1-47] returns a list of file names instead, which you can process directly with a ``for-in`` loop.

====Example 1-47. Using os.listdir to Search the File System====[eg-1-47]

```
File: os-path-walk-example-2.py

import os

def index(directory):
    # like os.listdir, but traverses directory trees
    stack = [directory]
    files = []
    while stack:
        directory = stack.pop()
        for file in os.listdir(directory):
            fullname = os.path.join(directory, file)
            files.append(fullname)
            if os.path.isdir(fullname) and not os.path.islink(fullname):
                stack.append(fullname)
    return files

for file in index("."):
    print file

*B*.\aifc-example-1.py
.\anydbm-example-1.py
.\array-example-1.py
...*b*
```

If you do not want to build a list of all files (for performance or memory reasons), [Example 1-48 #eg-1-48] shows another approach. Here the //DirectoryWalker// class behaves like a sequence object and returns one file at a time, much like a generator.

====Example 1-48. Using DirectoryWalker to Search the File System====[eg-1-48]

```
File: os-path-walk-example-3.py

import os

class DirectoryWalker:
    # a forward iterator that traverses a directory tree

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # got a filename
                fullname = os.path.join(self.directory, file)
                if os.path.isdir(fullname) and not os.path.islink(fullname):
                    self.stack.append(fullname)
                return fullname

for file in DirectoryWalker("."):
    print file

*B*.\aifc-example-1.py
.\anydbm-example-1.py
.\array-example-1.py
...*b*
```

Note that the //DirectoryWalker// class does not check the index passed to the ``__getitem__`` method; it simply returns the next file on every call. This means the class only works when the items are read in sequence, as a ``for-in`` loop does. If you access items out of order (for example, by asking directly for a large index), it will not return the item you asked for.

Finally, if you need to work with file sizes and time stamps, [Example 1-49 #eg-1-49] shows a class that returns both the file name and its ``os.stat`` attributes (a tuple). This version saves one or two ``stat`` calls per file (both ``os.path.isdir`` and ``os.path.islink`` use ``stat`` internally), and it runs noticeably faster on some platforms.

====Example 1-49. Using DirectoryStatWalker to Search the File System====[eg-1-49]

```
File: os-path-walk-example-4.py

import os, stat

class DirectoryStatWalker:
    # a forward iterator that traverses a directory tree, and
    # returns the filename and additional file information

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # got a filename
                fullname = os.path.join(self.directory, file)
                # note: os.stat follows symbolic links, so S_ISLNK is
                # never true here; os.lstat would be needed to detect links
                st = os.stat(fullname)
                mode = st[stat.ST_MODE]
                if stat.S_ISDIR(mode) and not stat.S_ISLNK(mode):
                    self.stack.append(fullname)
                return fullname, st

for file, st in DirectoryStatWalker("."):
    print file, st[stat.ST_SIZE]

*B*.\aifc-example-1.py 336
.\anydbm-example-1.py 244
.\array-example-1.py 526*b*
```
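
The walker classes above return one file at a time through ``__getitem__``. As a rough sketch of the same idea, the generator below yields the file name together with its stat data. It is not part of the original examples: it assumes Python 3 syntax rather than the Python 2 syntax of the listings above, the name ``walk_with_stat`` is made up for illustration, and it uses ``os.lstat`` (which does not follow symbolic links) so that linked directories are not descended into. The small temporary tree is only there so the sketch can be run anywhere.

```python
import os
import stat
import tempfile


def walk_with_stat(directory):
    """Yield (fullname, stat_result) for every entry below directory."""
    stack = [directory]
    while stack:
        current = stack.pop()
        for name in sorted(os.listdir(current)):
            fullname = os.path.join(current, name)
            # lstat describes the link itself, so a symbolic link to a
            # directory is not reported as a directory and is not followed
            st = os.lstat(fullname)
            if stat.S_ISDIR(st.st_mode):
                stack.append(fullname)
            yield fullname, st


# Build a tiny, hypothetical directory tree to walk.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "samples"))
with open(os.path.join(root, "samples", "sample.txt"), "w") as f:
    f.write("hello")

# Map each path (relative to the temporary root) to its size in bytes.
found = {os.path.relpath(name, root): st.st_size
         for name, st in walk_with_stat(root)}

for name in sorted(found):
    print(name, found[name])
```

Because the function is a generator, nothing is listed until the loop asks for the next item, so memory use stays small even for large trees. As with //DirectoryStatWalker//, only one stat-style call is made per entry.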