Why use re.match() when re.search() can do the same thing?

From the documentation, it's very clear that:

match() -> apply the pattern at the beginning of the string
search() -> scan through the string and return the first match

And search() with '^' and without the re.M flag would work the same as match().

Then why does Python have match()? Isn't it redundant? Are there any performance benefits to keeping match() in Python?

2 Answers

#1

"Why" questions are hard to answer. As a matter of fact, you could define the function re.match() like this:

import re
def match(pattern, string, flags=0):
    # \A anchors the search at the very start of the string
    return re.search(r"\A(?:" + pattern + ")", string, flags)

(because \A always matches at the start of the string, regardless of whether the re.M flag is set).
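
A quick way to check that claim, as a minimal sketch: the helper name match_via_search and the sample text are illustrative assumptions, not part of the re module.

```python
import re

def match_via_search(pattern, string, flags=0):
    # Hypothetical helper: emulate re.match() with re.search() and \A.
    return re.search(r"\A(?:" + pattern + ")", string, flags)

text = "first line\nsecond line"

# With re.M, '^' also matches right after a newline, so search() finds "second".
caret_hit = re.search(r"^second", text, re.M)

# \A only matches at the very start, so the emulation agrees with re.match().
emulated = match_via_search(r"second", text, re.M)
builtin = re.match(r"second", text, re.M)

print(caret_hit.span())   # (11, 17)
print(emulated, builtin)  # None None
```

So '^' stops being a substitute for match() as soon as re.M is involved, while \A keeps working.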

So re.match() is a useful shortcut but not strictly necessary. It's especially confusing for Java programmers, whose Pattern.matches() anchors the match to both the start and the end of the string (which is probably a more common use case than anchoring only to the start).
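
For comparison, Python 3.4 and later also provide re.fullmatch(), which anchors at both ends much like Java's Pattern.matches(). A small sketch with made-up sample strings:

```python
import re

# re.match() anchors only at the start, so trailing text is allowed.
start_only = re.match(r"\d+", "123abc")

# re.fullmatch() requires the whole string to match the pattern.
whole = re.fullmatch(r"\d+", "123abc")
whole_ok = re.fullmatch(r"\d+", "123")

print(start_only.group())  # 123
print(whole)               # None
print(whole_ok.group())    # 123
```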

It's different for the match and search methods of regex objects, though, as Eric has pointed out.

#2

The pos argument behaves differently in important ways:

>>> import re
>>> s = "a ab abc abcd"
>>> re.compile('a').match(s, pos=2)
<_sre.SRE_Match object; span=(2, 3), match='a'>
>>> print(re.compile('^a').search(s, pos=2))
None

match() makes it possible to write a tokenizer and ensure that characters are never skipped. search() has no way of saying "start from the earliest allowable character".
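
The difference is easiest to see when pos lands on a character the pattern cannot consume. A minimal sketch, reusing the string from above with an assumed word pattern:

```python
import re

s = "a ab abc abcd"
word = re.compile(r"[a-z]+")  # illustrative pattern, not from the original answer

# Position 1 is a space. search() silently skips it and reports the next word.
found = word.search(s, 1)

# match() must succeed exactly at pos, so it reports the gap instead.
anchored = word.match(s, 1)

print(found.span())  # (2, 4)
print(anchored)      # None
```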

Example use of match to break up a string with no gaps:

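def tokenize(s, patt):
    # patt is a compiled regex; it should never match the empty string,
    # otherwise `at` would stop advancing.
    at = 0
    while at < len(s):
        # match() must succeed exactly at `at`, so no character is skipped
        m = patt.match(s, pos=at)
        if not m:
            raise ValueError("Did not expect character at location {}".format(at))
        at = m.end()
        yield m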
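
A possible way to exercise the tokenizer, as a sketch only: the TOKEN pattern and the input strings are assumptions chosen for illustration, and the generator is repeated so the snippet runs on its own.

```python
import re

def tokenize(s, patt):
    # Same generator as above, repeated so this sketch is self-contained.
    at = 0
    while at < len(s):
        m = patt.match(s, pos=at)
        if not m:
            raise ValueError("Did not expect character at location {}".format(at))
        at = m.end()
        yield m

# Hypothetical token pattern: a run of digits, a run of spaces, or one operator.
TOKEN = re.compile(r"\d+|\s+|[-+*/()]")

tokens = [m.group() for m in tokenize("12 + (3*4)", TOKEN)]
print(tokens)  # ['12', ' ', '+', ' ', '(', '3', '*', '4', ')']

try:
    list(tokenize("12 + x", TOKEN))
except ValueError as exc:
    error = str(exc)
    print(error)  # Did not expect character at location 5
```

Joining the tokens gives back the input exactly, which is the "no gaps" property. With search() in place of match(), the unexpected 'x' would go unreported because no token pattern matches it and scanning would simply run past it.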
