Regular expression with preg_replace in PHP

Time: 2022-05-02 22:26:09

It's been a long time since I last played with PHP and regex, and I'd like to find a regex that does the following.

My string contains:

<pre code="...">some piece of code</pre> other non code content <pre code="...">some piece of code</pre> other non code content...

The goal is to replace each <pre>code</pre> block by ...

The "code" inside the <pre>...</pre> should also be escaped with htmlspecialchars.

I've already tried a few regexes, without success.

Any idea?

Thanks

1 solution

#1


Generally, using RegEx to parse HTML is a bad idea. There are plenty of simple scenarios where RegEx is enough to solve a particular problem, and that is great.

I would argue that in your case using RegEx is a bad idea: it will not cover all cases and it is likely insecure. You are possibly trying to prevent XSS vulnerabilities, and RegEx-based solutions are always error-prone.
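To make the risk concrete, here is a minimal sketch of one failure case. It assumes a lazy pattern of the usual `<pre>(.*?)</pre>` shape; the pattern, the helper name `escapePreBlocks`, and the sample input are illustrative and not taken from the question. When the code itself contains the text `</pre>`, the match stops early and whatever follows is left unescaped.

```php
<?php
// Hypothetical helper: escapes the body of each <pre> block with a naive lazy regex.
function escapePreBlocks(string $html): string
{
    return preg_replace_callback(
        '/(<pre\b[^>]*>)(.*?)(<\/pre>)/s',
        function (array $m): string {
            // Keep the tags, escape only what the regex thinks is the code.
            return $m[1] . htmlspecialchars($m[2]) . $m[3];
        },
        $html
    );
}

// The code sample itself mentions "</pre>", so the lazy match ends there.
$input  = '<pre>echo "</pre>"; <script>alert(1)</script></pre>';
$output = escapePreBlocks($input);

// The <script> tag sits after the first </pre> and is passed through unescaped.
echo $output, PHP_EOL;
// <pre>echo &quot;</pre>"; <script>alert(1)</script></pre>
```

Only `echo "` gets escaped here; the script tag reaches the browser intact, which is exactly the kind of gap an XSS filter cannot afford.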

But for completeness' sake:

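$html = preg_replace_callback(
    // "s" lets "." match newlines inside multi-line <pre> blocks; "i" also matches <PRE>.
    '/(<\s*pre(?:\s[^>]+)?>)(.*?)(<\/\s*pre\s*>)/si',
    function ($match) {
        // Keep both tags as they are and escape only the code between them.
        return $match[1] . htmlspecialchars($match[2]) . $match[3];
    },
    $html
);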
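As a quick check, the snippet below runs a pattern of the same shape on a string laid out like the one in the question. The code inside the `<pre>` blocks, the `code` attribute values, and the `$escaped` variable are made up for illustration, and the `s` modifier is used so that `.` also matches newlines in multi-line blocks.

```php
<?php
// Sample input shaped like the question's string; the code snippets are made up.
$html = '<pre code="php">if ($a < $b) { echo "<b>hi</b>"; }</pre> other non code content '
      . '<pre code="js">a && b</pre> more content';

$escaped = preg_replace_callback(
    '/(<\s*pre(?:\s[^>]+)?>)(.*?)(<\/\s*pre\s*>)/s',
    function (array $match): string {
        // Tags are kept verbatim; only the text between them is escaped.
        return $match[1] . htmlspecialchars($match[2]) . $match[3];
    },
    $html
);

echo $escaped, PHP_EOL;
// <pre code="php">if ($a &lt; $b) { echo &quot;&lt;b&gt;hi&lt;/b&gt;&quot;; }</pre> other non code content <pre code="js">a &amp;&amp; b</pre> more content
```

The text outside the `<pre>` blocks is left untouched, and the opening tags keep their attributes because the first capture group is returned as-is.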
