Regular expression that allows letters from any language (such as "ñ")

Time: 2021-09-05 13:09:10

I'm trying to let users enter special characters from other languages, such as Spanish or French. I originally had this:

 "/[^A-Za-z0-9\.\_\- ]/i" 

and then changed it to

 "/[^\p{L}\p{N}\.\_\-\(\) ]/i" 

but it still doesn't work. Letters such as "ñ" should be allowed. Thanks.

Revision: I found that adding (*UTF8) at the beginning helps solve the problem, so I'm now using the following pattern: "/(*UTF8)[^\p{L}A-Za-z0-9._\- ]/i"

Revision: After looking at the answers I decided to use: "/[^\p{Xwd}. -]/u". Thanks. (It works even with Chinese characters.)

2 Solutions

#1


3  

For languages written in the Latin script, you can use the \p{Latin} character class:

/[^\p{Latin}0-9._ -]/u

But if you want to allow letters and digits from every script:

/[^\p{Xwd}. -]/u

The "u" modifier indicates that the string must be read as a Unicode (UTF-8) string rather than as a sequence of single bytes.
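
A minimal sketch of how the two patterns above behave, assuming PHP's preg_match (the delimiter-and-modifier syntax in the question suggests PCRE in PHP, but the thread never names the language). The helper name hasDisallowedChars and the sample strings are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: returns true when $input contains at least one
// character that the negated class in $pattern does not allow.
function hasDisallowedChars(string $pattern, string $input): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $input) === 1;
}

// Latin-script letters, ASCII digits, ".", "_", space and "-".
$latinOnly = '/[^\p{Latin}0-9._ -]/u';
// Any Unicode letter or digit plus "_" (\p{Xwd}), ".", space and "-".
$anyScript = '/[^\p{Xwd}. -]/u';

var_dump(hasDisallowedChars($latinOnly, 'señor_1.txt')); // bool(false)
var_dump(hasDisallowedChars($latinOnly, '中文'));         // bool(true): not Latin script
var_dump(hasDisallowedChars($anyScript, '中文'));         // bool(false)
var_dump(hasDisallowedChars($anyScript, 'a*b'));         // bool(true): "*" is not allowed
```

Without the "u" modifier, PCRE examines the subject one byte at a time, which is presumably why the second attempt in the question still rejected "ñ" (two bytes in UTF-8).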

#2


0  

You could also look into specifying a Unicode range, e.g. [\w\u00C0-\u024F.-]+, to include Latin Extended letters. But it's hard to restrict characters to such a broad subset; what about Chinese, Vietnamese, etc.? I'm with Dagon on this one: best to allow anything.
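
A hedged sketch of that range approach, again assuming PHP with PCRE. PCRE does not accept the \uXXXX escape used in the answer, so the code points are written as \x{HHHH}, which needs the "u" modifier for values above 0xFF. ASCII word characters are spelled out instead of \w, because \w may already match every Unicode letter when "u" is set. The helper name isLatinRangeOnly and the sample strings are hypothetical, and the literals are assumed to be UTF-8 with precomposed (NFC) characters.

```php
<?php
// ASCII word characters plus U+00C0..U+024F (the accented part of
// Latin-1 Supplement through Latin Extended-B), "." and "-".
// Note: the range also contains "×" (U+00D7) and "÷" (U+00F7).
$latinRange = '/^[A-Za-z0-9_\x{00C0}-\x{024F}.-]+$/u';

// Hypothetical helper: true when every character of $input is in the class.
function isLatinRangeOnly(string $pattern, string $input): bool
{
    return preg_match($pattern, $input) === 1;
}

var_dump(isLatinRangeOnly($latinRange, 'niño-1.txt')); // bool(true): ñ is U+00F1
var_dump(isLatinRangeOnly($latinRange, 'Tiếng'));      // bool(false): ế is U+1EBF
var_dump(isLatinRangeOnly($latinRange, '中文'));        // bool(false)
```

The Vietnamese example illustrates the answer's caveat: a hand-picked range misses letters that live in other Unicode blocks.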
