Why doesn't this regular expression work in PHP?

Date: 2020-12-02 21:45:03

I need to match (case insensitive) "abcd" and an optional trademark symbol

Regex: /abcd(™)?/gi

See example:

preg_match("/abcd(™)?/gi","AbCd™  U9+",$matches);
print_r($matches);

When I run this, $matches isn't populated with anything... Not even created as an empty array. Any ideas?

4 answers

#1


5  

How is your file encoded? PHP has issues when it comes to Unicode. In your case, try using the escape sequence \x99 instead of embedding the TM symbol directly.

#2


3  

Note: I'm not a PHP guru. However, this looks like a character encoding issue. For example, your PHP file could be encoded as Windows-1252 (where ™ is the single byte \x99) while the data you are trying to match is encoded as UTF-8 (where ™ is the three bytes \xe2\x84\xa2), or vice versa (i.e. your file is UTF-8 and your data is Windows-1252). Try looking in this direction, and give us more information about what you are doing.
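
To illustrate the mismatch described above, here is a minimal sketch. It assumes the subject string is UTF-8; the variable names are made up for the example. The pattern is built from raw bytes, so no /u modifier is involved and PCRE compares byte by byte.

```php
<?php
// Byte sequences for the trademark sign (U+2122) in the two encodings discussed.
$tmUtf8   = "\xE2\x84\xA2"; // UTF-8: three bytes
$tmCp1252 = "\x99";         // Windows-1252: one byte

echo bin2hex($tmUtf8), "\n";   // e284a2
echo bin2hex($tmCp1252), "\n"; // 99

// Assumption: the data being searched is UTF-8.
$subject = "AbCd" . $tmUtf8 . "  U9+";

// Pattern and subject agree on the encoding: the optional group matches.
preg_match('/abcd(' . $tmUtf8 . ')?/i', $subject, $same);

// Pattern uses the Windows-1252 byte: "AbCd" still matches, but the
// optional group does not, so only element 0 is present in $mixed.
preg_match('/abcd(' . $tmCp1252 . ')?/i', $subject, $mixed);

echo count($same), "\n";  // 2
echo count($mixed), "\n"; // 1
```

Note that in the mismatched case preg_match() still reports a match, it just silently drops the trademark symbol, which can make this kind of bug easy to miss.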

#3


2  

I suspect it has something to do with the literal trademark symbol.

You'll probably want to check out how to use Unicode with your regular expressions, and then embed the escape sequence for the trademark symbol.
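
One way this could look, as a sketch rather than the only option: the /u modifier tells PCRE to treat both pattern and subject as UTF-8, and \x{2122} names the trademark code point. This assumes the subject really is valid UTF-8 and that PHP 7.0 or later is available for the "\u{2122}" string escape.

```php
<?php
// Assumption: the subject is valid UTF-8. "\u{2122}" (PHP 7.0+) expands
// to the UTF-8 bytes of the trademark sign.
$subject = "AbCd\u{2122}  U9+";

// Single quotes keep \x{2122} intact so that PCRE, not PHP, interprets it.
// i = case-insensitive, u = treat pattern and subject as UTF-8.
$result = preg_match('/abcd(\x{2122})?/iu', $subject, $matches);

var_dump($result);            // int(1)
echo bin2hex($matches[1]), "\n"; // e284a2
```

If the subject is not valid UTF-8, preg_match() with /u returns false, so checking the return value is worthwhile.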

#4


2  

It was a combination of things... this was the regex that finally worked:

/abcd(\xe2\x84\xa2)?/i

I had to remove the g modifier and change the TM symbol to \xe2\x84\xa2.
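
A short sketch of both fixes, assuming a UTF-8 subject string. PHP's preg functions have no g modifier: an unknown modifier makes preg_match() emit a warning and return false. The usual way to get "global" behaviour is preg_match_all(). The variable names below are illustrative only.

```php
<?php
// Assumption: the subject is UTF-8, so the trademark sign is \xE2\x84\xA2.
$subject = "AbCd\xE2\x84\xA2  U9+";

// With "g" the pattern fails to compile; @ only hides the warning here.
$bad = @preg_match('/abcd(\xe2\x84\xa2)?/gi', $subject, $ignored);
var_dump($bad); // bool(false)

// Without "g" the pattern compiles. In single quotes, PCRE itself
// interprets \xe2\x84\xa2 as three literal bytes.
$ok = preg_match('/abcd(\xe2\x84\xa2)?/i', $subject, $matches);
var_dump($ok); // int(1)

// To find every occurrence, use preg_match_all() instead of a g flag.
$count = preg_match_all(
    '/abcd(\xe2\x84\xa2)?/i',
    "abcd and ABCD\xE2\x84\xA2",
    $all
);
var_dump($count); // int(2)
```

Checking the return value with === false is a cheap way to catch pattern errors like the stray modifier.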
