I need to split a string by separators, some known to me in advance and one that is not. For example, I know I want to split the string by "\n", "," and ".", but also by one separator that can be user-defined: it could be ";" or "hello" or pretty much anything.
I tried this:
"[\n|,|.|".$exp."]"
...but that didn't work as expected. As I understand it, | means "or", so this regex should say: split by "\n" or "," or "." or "hello". I think the problem is that if I try just [hello], it splits by every letter, not the whole word. That's strange, because if I try just [\n], it only splits by "\n", not by "\" or "n".
Can someone please explain this to me? :)
6 solutions
#1
6
When you place a bunch of characters in a character class, as in [hello], this defines a token that matches one character that is either h, e, l or o. Also, | has no meaning inside of a character class - it's just matched as a normal character.
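To make the difference concrete, here is a minimal PHP sketch comparing a character class with a group. The sample strings and variable names are made up for illustration and are not from the question.

```php
<?php
// A character class matches ONE character out of the set, so [hello]
// splits on every h, e, l or o instead of on the word "hello".
$byClass = preg_split('/[hello]/', 'a hello b');

// A group (or simply the bare word) matches the whole sequence "hello".
$byGroup = preg_split('/(hello)/', 'a hello b');

// Inside a class, | is an ordinary character: [a|b] matches a, | or b.
$byPipe = preg_split('/[a|b]/', 'x|y');

var_dump($byClass, $byGroup, $byPipe);
```

The class version returns the two outer pieces with four empty strings between them (one for each gap between the letters of "hello"), while the group version returns just the two pieces on either side of the word.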
The correct solution is not a character class. You meant to use parentheses, which form a group:
(\n|,|\.|".$exp.")
By the way, make sure that you escape any regex metacharacters that appear in $exp. Basically, everything in the full list here needs to be escaped with a backslash: http://regular-expressions.info/reference.html. In PHP, the helper function preg_quote() does this for you.
EDIT: Since you're no longer using a character class, the . also needs to be escaped as \., because outside a class it is a metacharacter meaning 'match any single character'. Almost forgot.
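Putting the pieces of this answer together, a sketch of the whole split might look like the following. The function name splitBySeparators and the sample inputs are hypothetical. It assumes the fixed separators are exactly "\n", "," and "." as in the question, and that an empty user separator should simply be ignored.

```php
<?php
// Hypothetical helper, not part of the original answer.
function splitBySeparators(string $input, string $userSep): array
{
    // Fixed separators. In single quotes PHP leaves \n and \. untouched,
    // so the regex engine reads them as "newline" and "literal dot".
    $alternatives = ['\n', ',', '\.'];

    if ($userSep !== '') {
        // preg_quote() escapes regex metacharacters in the user's text;
        // the second argument also escapes the pattern delimiter "/".
        $alternatives[] = preg_quote($userSep, '/');
    }

    $pattern = '/' . implode('|', $alternatives) . '/';

    return preg_split($pattern, $input);
}

var_dump(splitBySeparators("a,b.c\nd+e", '+'));
var_dump(splitBySeparators('1hello2,3', 'hello'));
```

Because the user separator is one alternative in the pattern rather than a member of a character class, a multi-character separator such as "hello" is matched as a whole word.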
#2
1
\n is actually only one character, a newline (the \ before the n indicates an escape sequence), so that's why it works and hello doesn't.
Also, keep in mind that allowing arbitrary input into a regular expression can be a security risk, depending on what your regular expression is being used for, so be very careful and make sure you sanitize your input to that regular expression.
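As a small illustration of what can go wrong, the sketch below assumes the user chooses "." as the separator. The input string and variable names are invented for the example.

```php
<?php
// Illustrative values: the user picks "." as the separator.
$exp = '.';
$input = 'a;b';

// Unescaped, the user's . means "any character", so every character
// of the input is consumed as a separator.
$unsafe = preg_split('/\n|,|\.|' . $exp . '/', $input);

// preg_quote() turns "." into "\.", which only matches a literal dot.
$safe = preg_split('/\n|,|\.|' . preg_quote($exp, '/') . '/', $input);

var_dump($unsafe, $safe);
```

With the unescaped pattern the result is nothing but empty strings, whereas the quoted pattern leaves "a;b" intact because the input contains none of the separators.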
#3
1
Try using this regex:
preg_split('#[\n,.]|'.$exp.'#', ...);
Note the single quotes, which stop PHP from replacing \n with a newline character before the pattern reaches the regex engine.
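For what it's worth, both quoting styles appear to end up splitting on a newline; they differ only in who interprets the \n. A short sketch, with variable names chosen purely for illustration:

```php
<?php
// Double quotes: PHP converts \n to a single newline character
// before the pattern reaches the regex engine.
$doubleQuoted = "/[\n,.]/";

// Single quotes: the backslash and the n stay as two characters,
// and the regex engine itself reads \n as "newline".
$singleQuoted = '/[\n,.]/';

$a = preg_split($doubleQuoted, "x\ny,z");
$b = preg_split($singleQuoted, "x\ny,z");

var_dump(strlen("\n"), strlen('\n'), $a === $b);
```

Single quotes are still the safer habit for patterns, since they also keep PHP from interpolating $ variables or rewriting other backslash sequences.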
#4
1
Drop the [ and ], as these define a character class. \n counts as a single character in a double-quoted string. Just using the alternatives without the character class should work as you need:
// \. matches a literal dot; a bare . would match any character
preg_split("/\n|,|\.|$exp/", $input);
#5
1
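Use preg_split() with a character class. Note that this only works when $exp is a single character, because a character class matches one character at a time.
For example: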
Input:
$exp = '#';
preg_split("/[,.\n$exp]/", "0\n1,2.3#4");
Output:
Array ( [0] => 0 [1] => 1 [2] => 2 [3] => 3 [4] => 4)
#6
1
Here is a simple solution:
"(\n|,|\.|".$exp.")"
Or you can write it as:
"([\n,.]|".$exp.")"