Right now my regex is something like this: [a-zA-Z0-9], but it does not include accented characters as I would like it to. I would also like the characters - ' , to be included.
3 Solutions
#1
7
Accented Characters: DIY Character Range Subtraction
If your regex engine allows it (and many will), this will work:
(?i)^(?:(?![×Þß÷þø])[-'0-9a-zÀ-ÿ])+$
Please see the demo (you can add characters to test).
Explanation
- (?i) sets case-insensitive mode
- The ^ anchor asserts that we are at the beginning of the string
- (?:(?![×Þß÷þø])[-'0-9a-zÀ-ÿ]) matches one character...
- The lookahead (?![×Þß÷þø]) asserts that the char is not one of those in the brackets
- [-'0-9a-zÀ-ÿ] allows dash, apostrophe, digits, letters, and chars in a wide accented range, from which we need to subtract
- The + matches that one or more times
- The $ anchor asserts that we are at the end of the string
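To try the pattern outside the demo, here is a minimal sketch in Python, assuming the built-in re module and str (Unicode) input. The pattern is copied from the answer unchanged; the names SUBTRACTION_RE and matches_subtraction are made up for illustration.

```python
import re

# Pattern from the answer, unchanged: a wide accented range (À-ÿ) with a
# negative lookahead that "subtracts" the characters we do not want.
SUBTRACTION_RE = re.compile(r"(?i)^(?:(?![×Þß÷þø])[-'0-9a-zÀ-ÿ])+$")

def matches_subtraction(text: str) -> bool:
    """Return True if the whole string consists of allowed characters."""
    return SUBTRACTION_RE.match(text) is not None

# Hypothetical sample inputs, not taken from the original question.
for sample in ["Renée", "O'Brien-Müller", "a×b", "two words"]:
    print(sample, matches_subtraction(sample))
```

The first two samples should be accepted, while "a×b" is rejected by the lookahead and "two words" is rejected because a space is not in the class. Because (?i) also applies inside the lookahead, the uppercase forms of the excluded letters (for example Ø) are excluded as well.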
Reference
- Extended ASCII Table
#2
0
Use a POSIX character class (http://www.regular-expressions.info/posixbrackets.html):
[-'[:alpha:]0-9] or [-'[:alnum:]]
The [:alpha:] character class matches whatever is considered "alphabetic characters" in your locale.
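POSIX bracket expressions are not available in every engine. Python's built-in re module, for example, does not implement [:alpha:] or [:alnum:]. The following is a rough equivalent of [-'[:alnum:]], assuming Python 3 str patterns, where \w is Unicode-aware rather than locale-dependent. The names ALNUM_RE and matches_alnum are made up for illustration.

```python
import re

# [^\W_] means "any \w character except the underscore", which for str
# patterns is the set of Unicode letters and digits. The dash and the
# apostrophe are added as a second alternative.
ALNUM_RE = re.compile(r"^(?:[^\W_]|[-'])+$")

def matches_alnum(text: str) -> bool:
    """Return True if the string has only letters, digits, - and '."""
    return ALNUM_RE.match(text) is not None

# Hypothetical sample inputs, not taken from the original question.
for sample in ["Zoë-2", "Ελένη", "snake_case", "a b"]:
    print(sample, matches_alnum(sample))
```

Note that this is much broader than the À-ÿ range: it accepts letters from any script (the Greek sample passes), which may or may not be what you want.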
#3
0
A version without the exclusion rules:
^[-'a-zA-ZÀ-ÖØ-öø-ÿ]+$
Explanation
- The ^ anchor asserts that we are at the beginning of the string
- [...] allows dash, apostrophe, letters, and chars in a wide accented range. The ranges À-Ö, Ø-ö and ø-ÿ skip × and ÷. Digits are not included; add 0-9 inside the brackets if you need them
- The + matches that one or more times
- The $ anchor asserts that we are at the end of the string
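Here is a quick way to check what this version accepts: a minimal sketch in Python, assuming the built-in re module and str input. The names EXPLICIT_RE and matches_explicit are made up for illustration.

```python
import re

# Pattern from the answer, unchanged: three explicit ranges that step
# around × (between Ö and Ø) and ÷ (between ö and ø), so no lookahead
# is needed.
EXPLICIT_RE = re.compile(r"^[-'a-zA-ZÀ-ÖØ-öø-ÿ]+$")

def matches_explicit(text: str) -> bool:
    """Return True if the whole string consists of allowed characters."""
    return EXPLICIT_RE.match(text) is not None

# Hypothetical sample inputs, not taken from the original question.
for sample in ["Søren", "Straße", "a÷b", "Room101"]:
    print(sample, matches_explicit(sample))
```

"Søren" and "Straße" should pass, since ø and ß fall inside the listed ranges. "a÷b" fails because ÷ sits in the gap between ö and ø, and "Room101" fails because this pattern has no 0-9.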
Reference
- Extended ASCII Table