Replacing a string in files with Python

Time: 2021-10-29 18:22:58

How can you recursively replace every match of a given string with a given replacement in all files under a directory and its subdirectories?

Pseudo-code

import os

for root, dirs, files in os.walk("/home/noa/Desktop/codes"):
    for name in files:  # iterate over files, not dirs: the text lives in files
        path = os.path.join(root, name)
        # This is where I am trying to replace "dbname=noa user=noa"
        # with "dbname=masi user=masi" in the file at path.
        pass

5 answers

#1


23  

Put all this code into a file called mass_replace. Under Linux or Mac OS X, you can do chmod +x mass_replace and then just run this. Under Windows, you can run it with python mass_replace followed by the appropriate arguments.

#!/usr/bin/env python3

import os
import re
import sys

# tuple of extensions to replace (None means every file)
DEFAULT_REPLACE_EXTENSIONS = None
# example: uncomment next line to only replace *.c, *.h, and/or *.txt
# DEFAULT_REPLACE_EXTENSIONS = (".c", ".h", ".txt")

def try_to_replace(fname, replace_extensions=DEFAULT_REPLACE_EXTENSIONS):
    if replace_extensions:
        return fname.lower().endswith(replace_extensions)
    return True


def file_replace(fname, pat, s_after):
    # first, see if the pattern is even in the file.
    with open(fname) as f:
        if not any(pat.search(line) for line in f):
            return # pattern does not occur in file so we are done.

    # pattern is in the file, so perform replace operation.
    out_fname = fname + ".tmp"
    with open(fname) as f, open(out_fname, "w") as out:
        for line in f:
            out.write(pat.sub(s_after, line))
    # os.replace (Python 3.3+) overwrites an existing file on Windows too;
    # os.rename raises an error there when the target already exists.
    os.replace(out_fname, fname)


def mass_replace(dir_name, s_before, s_after, replace_extensions=DEFAULT_REPLACE_EXTENSIONS):
    pat = re.compile(s_before)
    for dirpath, dirnames, filenames in os.walk(dir_name):
        for fname in filenames:
            if try_to_replace(fname, replace_extensions):
                fullname = os.path.join(dirpath, fname)
                file_replace(fullname, pat, s_after)

if len(sys.argv) != 4:
    u = "Usage: mass_replace <dir_name> <string_before> <string_after>\n"
    sys.stderr.write(u)
    sys.exit(1)

mass_replace(sys.argv[1], sys.argv[2], sys.argv[3])

EDIT: I have changed the above code from the original answer. There are several changes. First, mass_replace() now calls re.compile() to pre-compile the search pattern; second, to check what extension the file has, we now pass in a tuple of file extensions to .endswith() rather than calling .endswith() three times; third, it now uses the with statement available in recent versions of Python; and finally, file_replace() now checks to see if the pattern is found within the file, and doesn't rewrite the file if the pattern is not found. (The old version would rewrite every file, changing the timestamps even if the output file was identical to the input file; this was inelegant.)
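
The tuple form of .endswith() described in the second change can be checked in isolation. This is only a minimal sketch: the file name Main.C is made up for illustration, and the extension tuple is the one from the commented-out line in the script. It also shows that a list is not accepted where the tuple is.

```python
# Hypothetical file name, used only for this illustration.
name = "Main.C"
exts = (".c", ".h", ".txt")

lowered = name.lower()

# Old style: one .endswith() call per extension.
three_calls = (lowered.endswith(".c")
               or lowered.endswith(".h")
               or lowered.endswith(".txt"))

# New style: a single call with a tuple of suffixes.
one_call = lowered.endswith(exts)

# A list is not accepted in place of the tuple; it raises TypeError.
try:
    lowered.endswith(list(exts))
    list_accepted = True
except TypeError:
    list_accepted = False

print(three_calls, one_call, list_accepted)
```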

EDIT: I changed this to replace in every file by default, but there is one line you can edit to limit it to particular extensions. I think replacing in every file is a more useful out-of-the-box default. This could be extended with a list of extensions or filenames not to touch, an option to make the match case insensitive, etc.

EDIT: In a comment, @asciimo pointed out a bug. I edited this to fix the bug. str.endswith() is documented to accept a tuple of strings to try, but not a list. Fixed. Also, I made a couple of the functions accept an optional argument to let you pass in a tuple of extensions; it should be pretty easy to modify this to accept a command-line argument to specify which extensions.
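
One possible way to take the extensions (and a case-insensitivity switch) from the command line is sketched below with argparse. The option names --ext and --ignore-case and the helper names parse_args and compile_pattern are assumptions made for this example; they are not part of the script above, which accepts only three positional arguments.

```python
import argparse
import re


def parse_args(argv):
    # Hypothetical options; the original script has no flags at all.
    parser = argparse.ArgumentParser(prog="mass_replace")
    parser.add_argument("dir_name")
    parser.add_argument("string_before")
    parser.add_argument("string_after")
    parser.add_argument("-e", "--ext", action="append", default=None,
                        help="extension to process, e.g. .txt (repeatable)")
    parser.add_argument("-i", "--ignore-case", action="store_true",
                        help="match string_before case-insensitively")
    args = parser.parse_args(argv)
    # str.endswith() needs a tuple, not a list, so convert here.
    args.ext = tuple(e.lower() for e in args.ext) if args.ext else None
    return args


def compile_pattern(args):
    # Build the regex with or without re.IGNORECASE, depending on the flag.
    flags = re.IGNORECASE if args.ignore_case else 0
    return re.compile(args.string_before, flags)
```

With this in place, args.ext could be passed as the replace_extensions argument of mass_replace(); using the case-insensitive pattern would need a small change so that mass_replace() accepts the flag or a precompiled pattern.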

#2


9  

Do you really need regular expressions?

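import os

def recursive_replace(root, pattern, replace):
    # Plain substring replacement; no regular expressions involved.
    for dirpath, subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            with open(path) as f:
                text = f.read()
            if pattern in text:
                # Rewrite the file only when it actually contains the pattern.
                with open(path, "w") as f:
                    f.write(text.replace(pattern, replace))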
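
To make the question concrete: answer #1 passes the search string to re.compile(), so any regex metacharacters in it are not matched literally. The sketch below uses a made-up sample string to compare a regex search, a plain str.replace(), and a regex built with re.escape().

```python
import re

# Made-up sample text for illustration.
text = "version 1.0 and version 1x0"

# As a regex, "." matches any character, so both spellings are rewritten.
as_regex = re.sub("1.0", "2.0", text)

# str.replace() treats the search string literally.
as_literal = text.replace("1.0", "2.0")

# re.escape() gives literal matching while still using the re module.
as_escaped = re.sub(re.escape("1.0"), "2.0", text)

print(as_regex)
print(as_literal)
print(as_escaped)
```

If the search string is always a fixed piece of text, str.replace() is the simpler choice; re.escape() is useful when the rest of the code already works with compiled patterns.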

#3


3  

Of course, if you just want to get it done without coding it up, use find and xargs:

find /home/noa/Desktop/codes -type f -print0 | \
xargs -0 sed --in-place "s/dbname=noa user=noa/dbname=masi user=masi/g"

(And you could likely do this with find's -exec or something as well, but I prefer xargs.)
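
For comparison, a similar in-place rewrite can be sketched in Python with the standard library's fileinput module, which redirects print() into the file being read when inplace=True. The function name replace_in_place is made up for this example, and the sketch assumes every file under the directory is a text file in the default encoding.

```python
import fileinput
import os


def replace_in_place(root, old, new):
    """Rewrite every file under root in place, similar to sed --in-place."""
    paths = [os.path.join(dirpath, name)
             for dirpath, _dirs, names in os.walk(root)
             for name in names]
    if not paths:
        return  # with no files given, fileinput would read from stdin
    with fileinput.input(files=paths, inplace=True) as lines:
        for line in lines:
            # With inplace=True, print() writes into the current file.
            print(line.replace(old, new), end="")
```

Unlike the sed command, this uses a literal substring match rather than a regular expression.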

#4


2  

This is how I would find and replace strings in files using Python. It is a simple little function that recursively searches a directory for a string and replaces it with another string. You can also limit it to files matching a pattern, such as a certain file extension, as in the example below.

import os, fnmatch
def findReplace(directory, find, replace, filePattern):
    for path, dirs, files in os.walk(os.path.abspath(directory)):
        for filename in fnmatch.filter(files, filePattern):
            filepath = os.path.join(path, filename)
            with open(filepath) as f:
                s = f.read()
            s = s.replace(find, replace)
            with open(filepath, "w") as f:
                f.write(s)

This allows you to do something like:

findReplace("some_dir", "find this", "replace with this", "*.txt")
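
The filePattern argument is a shell-style wildcard handled by fnmatch.filter(), not a regular expression. A quick sketch with made-up file names shows which names such a pattern selects:

```python
import fnmatch

# Made-up directory listing for illustration.
files = ["notes.txt", "README.md", "data.txt.bak", "report.txt"]

# "*.txt" keeps only names that end in .txt.
txt_only = fnmatch.filter(files, "*.txt")

# "*" keeps every name, so findReplace would touch every file.
everything = fnmatch.filter(files, "*")

print(txt_only)
print(everything)
```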

#5


2  

This should work:

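import re, os
import fnmatch

directory = "/home/noa/Desktop/codes"  # assumed: the directory from the question
filePattern = "namelist.wps"           # assumed: only files with this name are rewritten

for path, dirs, files in os.walk(os.path.abspath(directory)):
    for filename in fnmatch.filter(files, filePattern):
        filepath = os.path.join(path, filename)
        # Read all lines first, then rewrite the same file with the substitution applied.
        with open(filepath) as readf:
            lines = readf.readlines()
        with open(filepath, "w") as out:
            for line in lines:
                out.write(re.sub(r"dbname=noa user=noa", "dbname=masi user=masi", line))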
