I'm having a really weird problem with preg_replace (and as far as I can remember, this isn't the first time I've seen it). I have an XML document containing an element with invalid structure (the closing tag is missing its slash, which breaks the parser):
<info>
<datetime>2013.04.12 12:04:02</datetime>
<info>
What I'm trying to do is this: $xml = preg_replace('/<info>.*<info>/iu', '', $xml)
(because I don't actually need that element), but IT DOES NOT REPLACE.
How do I make it work?
4 Solutions
#1
3
Add the s modifier and use ? to make the quantifier non-greedy:
$string = '<info>
<datetime>2013.04.12 12:04:02</datetime>
<info>
<valid>2013.04.12 12:04:02</valid>
<info>
<datetime>2013.04.12 12:04:02</datetime>
<info>';
var_dump(preg_replace('/<info>.*?<info>/s', '', $string));
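To see why the ? matters here, a minimal sketch that runs the greedy and the non-greedy pattern on the same sample string. The variable names $greedy and $lazy are only for illustration and are not part of the original answer:

```php
<?php
// Sample input from the answer above: two broken <info> blocks around one valid element.
$string = '<info>
<datetime>2013.04.12 12:04:02</datetime>
<info>
<valid>2013.04.12 12:04:02</valid>
<info>
<datetime>2013.04.12 12:04:02</datetime>
<info>';

// Greedy: .* runs to the LAST <info>, so the <valid> element is removed as well.
$greedy = preg_replace('/<info>.*<info>/s', '', $string);

// Non-greedy: .*? stops at the NEXT <info>, so each broken block is removed on its own.
$lazy = preg_replace('/<info>.*?<info>/s', '', $string);

var_dump($greedy); // string(0) ""
var_dump($lazy);   // the <valid> line survives, with a newline on each side
```

If the document only ever contains one broken block, both patterns give the same result; the non-greedy form is the safer choice when there may be several.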
#2
4
Try adding the s modifier to the regex. With it, the dot no longer stops matching at a newline.
#3
4
It doesn't replace because there are no matches:
<?php
$xml = '<info>
<datetime>2013.04.12 12:04:02</datetime>
<info>';
var_dump(preg_match('/<info>.*<info>/iu', $xml, $matches), $matches);
int(0)
array(0) {
}
Let's see what's wrong. What does . mean exactly?
match any character except newline (by default)
So there it is! How do you change the default? We have a look at the available internal options and find this:
s for PCRE_DOTALL
... where PCRE_DOTALL means:
s (PCRE_DOTALL)
If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded.
We can change it locally:
'/<info>(?s:.*)<info>/iu'
          ^
... or globally:
'/<info>.*<info>/ius'
                   ^
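As a quick check, the three variants can be compared side by side with preg_match on the sample input from this answer. This is only a sketch; the variable names $noDotall, $local and $global are illustrative:

```php
<?php
$xml = '<info>
<datetime>2013.04.12 12:04:02</datetime>
<info>';

// Without s, the dot stops at the first newline, so nothing matches.
$noDotall = preg_match('/<info>.*<info>/iu', $xml);

// Inline group: DOTALL applies only inside (?s: ... ).
$local = preg_match('/<info>(?s:.*)<info>/iu', $xml);

// Trailing modifier: DOTALL applies to every dot in the pattern.
$global = preg_match('/<info>.*<info>/ius', $xml);

var_dump($noDotall, $local, $global); // int(0), int(1), int(1)
```

The inline form is mostly useful when a pattern has several dots and only one of them should be allowed to cross line boundaries.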
#4
2
See http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
You need to use the s modifier at the end of your regex.
$xml = preg_replace('/<info>.*<info>/ius', '', $xml);
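If the call still seems to do nothing, it can help to check whether anything was actually replaced. A small sketch, assuming the sample input from the question: the optional fifth argument of preg_replace receives the number of replacements, and the function returns NULL on error (for example, when the u modifier is used on input that is not valid UTF-8). The variable names $result and $count are illustrative:

```php
<?php
$xml = '<info>
<datetime>2013.04.12 12:04:02</datetime>
<info>';

// The fifth argument receives the number of replacements performed.
$result = preg_replace('/<info>.*<info>/ius', '', $xml, -1, $count);

if ($result === null) {
    // NULL signals a PCRE error; preg_last_error() returns the error code.
    echo 'PCRE error code: ', preg_last_error(), "\n";
} else {
    echo "Replacements: $count\n"; // Replacements: 1
}
```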