I have a string like this
"yJdz:jkj8h:jkhd::hjkjh"
I want to split it using colon as a separator, but not a double colon. Desired result:
("yJdz", "jkj8h", "jkhd::hjkjh")
I'm trying with:
re.split(":{1}", "yJdz:jkj8h:jkhd::hjkjh")
but I get the wrong result: the double colon is split as well.
In the meantime I'm escaping "::" with string.replace("::", "$$").
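The escaping workaround mentioned in the question could look roughly like the sketch below. The helper name `split_single_colons` and its `placeholder` parameter are hypothetical, and the sketch assumes the placeholder never occurs in the input.

```python
def split_single_colons(text, placeholder="$$"):
    """Split on ':' while keeping '::' intact, via a temporary placeholder."""
    # Assumption: the placeholder must not already appear in the input,
    # otherwise restoring it would corrupt the data.
    if placeholder in text:
        raise ValueError("placeholder occurs in the input")
    # Hide every double colon, split on the remaining single colons,
    # then put the double colons back.
    escaped = text.replace("::", placeholder)
    return [part.replace(placeholder, "::") for part in escaped.split(":")]


print(split_single_colons("yJdz:jkj8h:jkhd::hjkjh"))
# ['yJdz', 'jkj8h', 'jkhd::hjkjh']
```

This only treats exact pairs of colons as escaped; a run of three colons would still be split after the first pair.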
2 solutions
#1
22
You could split on (?<!:):(?!:). This uses two negative lookarounds (a lookbehind and a lookahead), which assert that a valid match is a single colon with no colon before or after it.
To explain the pattern:
(?<!:) # assert that the previous character is not a colon
: # match a literal : character
(?!:) # assert that the next character is not a colon
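As a sketch, the commented pattern can be used almost verbatim in Python by compiling it with `re.VERBOSE`, which ignores whitespace and `#` comments inside the pattern. The name `SINGLE_COLON` is only illustrative.

```python
import re

# re.VERBOSE lets the pattern keep its layout and inline comments.
SINGLE_COLON = re.compile(
    r"""
    (?<!:)   # assert that the previous character is not a colon
    :        # match a literal : character
    (?!:)    # assert that the next character is not a colon
    """,
    re.VERBOSE,
)

parts = SINGLE_COLON.split("yJdz:jkj8h:jkhd::hjkjh")
print(parts)
# ['yJdz', 'jkj8h', 'jkhd::hjkjh']
```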
Both lookarounds are needed: with only the lookbehind, the regular expression engine would match the first colon in :: (because the previous character isn't a colon), and with only the lookahead, the second colon would match (because the next character isn't a colon).
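A quick way to check this is to run each variant against the string from the question. The variable names below are illustrative only.

```python
import re

s = "yJdz:jkj8h:jkhd::hjkjh"

# Lookbehind only: the first colon of "::" still matches.
lookbehind_only = re.split(r"(?<!:):", s)

# Lookahead only: the second colon of "::" still matches.
lookahead_only = re.split(r":(?!:)", s)

# Both lookarounds: neither colon of "::" matches.
both = re.split(r"(?<!:):(?!:)", s)

print(lookbehind_only)  # ['yJdz', 'jkj8h', 'jkhd', ':hjkjh']
print(lookahead_only)   # ['yJdz', 'jkj8h', 'jkhd:', 'hjkjh']
print(both)             # ['yJdz', 'jkj8h', 'jkhd::hjkjh']
```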
#2
11
You can do this with lookahead and lookbehind, if you want:
>>> import re
>>> s = "yJdz:jkj8h:jkhd::hjkjh"
>>> l = re.split(r"(?<!:):(?!:)", s)
>>> print(l)
['yJdz', 'jkj8h', 'jkhd::hjkjh']
This regex essentially says "match a : that is neither followed nor preceded by a :".