I'm writing a callback function for CodeIgniter's form validation. Only letters, numbers, dash, underscore and space are allowed. I'm currently using this regex:
preg_match("/^([-a-z_ ])+$/i", $string)
But it won't work with non-ASCII characters like č š ć đ ž â etc. It's a field for entering a name and surname, so it has to accept all these non-ASCII characters as well. How can I modify this regex to include those characters?
3 solutions
#1
3
You can use the Unicode letter and Unicode number properties for this:
preg_match('/^([-_ \p{L}\p{N}])+$/iu', $string)
Update: You may not need a capturing group here:
preg_match('/^[-_ \p{L}\p{N}]+$/iu', $string)
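As a minimal sketch of how this pattern could back the validation callback from the question: the function name is_valid_name is hypothetical, and the source file and input are assumed to be UTF-8 encoded. In CodeIgniter the same check would presumably sit in a controller method referenced through a callback_ rule, which is not shown here.

```php
<?php
// Hypothetical helper name; the pattern is the one given in the answer above.
function is_valid_name(string $string): bool
{
    // \p{L} matches any Unicode letter, \p{N} any Unicode number.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    // preg_match() returns 1 on a match, 0 on no match, and false on
    // error (for example, when the subject is not valid UTF-8).
    return preg_match('/^[-_ \p{L}\p{N}]+$/iu', $string) === 1;
}

var_dump(is_valid_name('Đorđe Žižek-Šarić')); // bool(true)
var_dump(is_valid_name('Ana_2'));             // bool(true)
var_dump(is_valid_name('<script>'));          // bool(false)
```

Comparing against 1 rather than relying on truthiness keeps the error case (false) and the no-match case (0) from being mistaken for a valid name.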
#2
0
According to http://us2.php.net/manual/ro/reference.pcre.pattern.modifiers.php, you just need to use the Unicode modifier:
preg_match("/^([-a-z_ ])+$/ui", $string)
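A quick check worth running before relying on this: the u modifier switches the pattern and subject to UTF-8 handling, but the class [a-z] still lists only ASCII letters. The sketch below compares the pattern above with one that uses \p{L}; the variable names and sample name are illustrative, and the file is assumed to be saved as UTF-8.

```php
<?php
// Pattern from this answer: u added, character class still ASCII-only.
$asciiClass = '/^([-a-z_ ])+$/ui';
// Same idea with the Unicode letter property instead of a-z.
$unicodeClass = '/^[-_ \p{L}]+$/u';

$name = 'Čedomir Šarić';

var_dump(preg_match($asciiClass, $name));   // int(0): č, š, ć are not in a-z
var_dump(preg_match($unicodeClass, $name)); // int(1)
```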
#3
0
Use the u modifier and \p{L}; to add numbers, you may use [0-9] or \p{N}:
preg_match('/^[-\p{L}\p{N}_ ]+$/u', $string)
                ^^^^^^^^^^      ^
Note that you do not want to add overhead with unnecessary capturing groups. I removed the parentheses for better performance. The i modifier is redundant since there is no literal letter in the pattern.
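The choice between [0-9] and \p{N} changes what is accepted, so a small comparison may help. This is a sketch with illustrative variable names, assuming UTF-8 source and input: \p{N} covers every Unicode number category, including non-ASCII digits and characters such as fractions, while [0-9] accepts ASCII digits only.

```php
<?php
$asciiDigits   = '/^[0-9]+$/u';
$unicodeNumber = '/^\p{N}+$/u';

// ASCII digits, Arabic-Indic digits, and a vulgar fraction.
$samples = ['2024', '٢٠٢٤', '½'];

$results = [];
foreach ($samples as $sample) {
    $results[$sample] = [
        'ascii'   => preg_match($asciiDigits, $sample),
        'unicode' => preg_match($unicodeNumber, $sample),
    ];
}

// Only '2024' passes [0-9]; all three samples pass \p{N}.
var_dump($results);
```

If only decimal digits from any script should pass, \p{Nd} is the narrower property to consider.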
See demo
My regex performance:
Anubhava's regex: