I am looking for a very particular RegEx (or another solution with comparable performance) in Python to substitute patterns like those in the following examples:
...-1AG.,., should be transformed as ...G.,.,
..,-1A,.,., should be transformed as ..,,.,.,
...-2GTC,., should be transformed as ...C,.,
..,-2GT.,., should be transformed as ..,.,.,
...+3TAGT,, should be transformed as ...T,,
..,+3TAG.,. should be transformed as ..,.,.
Basically:
Any symbol (not only dots and commas), followed by one +/- sign, followed by a single digit (1..9), followed by as many letters as that digit indicates, and finally any symbol (not only dots and commas),
should be transformed to:
the first symbol immediately followed by the last symbol (again, not only dots and commas), with everything in between removed.
Obviously, String = re.sub(r'[\-\+]\d\w+', "", String) is not right for a case like ...-1AG.,., (which should become ...G.,.,), because \w+ removes every letter after the digit rather than only the first one. So far I am looping over r'[\-\+]1\w', r'[\-\+]2\w\w', r'[\-\+]3\w\w\w' ... r'[\-\+]9\w\w\w\w\w\w\w\w\w', but I am hoping for a more elegant solution. Any ideas?
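For reference, the loop over nine patterns can be written compactly with a counted quantifier, since \w{n} matches exactly n word characters. This is only a sketch of that looping approach; the function name strip_indels_loop is made up for illustration. Note that \w also matches digits and underscores, and that each pass runs on the output of the previous one.

```python
import re

def strip_indels_loop(s):
    # Hypothetical helper: one substitution pass per possible digit 1..9.
    # r"[-+]2\w{2}" is equivalent to r"[-+]2\w\w", and so on.
    for n in range(1, 10):
        s = re.sub(r"[-+]%d\w{%d}" % (n, n), "", s)
    return s

print(strip_indels_loop("...-1AG.,.,"))
print(strip_indels_loop("...+3TAGT,,"))
```

This still scans the string nine times, which is what a single-pass solution would avoid.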
1 solution
#1
3
Have a look at this working demo.
import re

# Each line pairs an input with the output it should produce.
x = """...-1AG.,., should be transformed as ...G.,.,
..,-1A,.,., should be transformed as ..,,.,.,
...-2GTC,., should be transformed as ...C,.,
..,-2GT.,., should be transformed as ..,.,.,
...+3TAGT,, should be transformed as ...T,,
..,+3TAG.,. should be transformed as ..,.,."""

def repl(matchobj):
    # group(1) is the count, group(2) the run of letters; drop the first `count` letters.
    return matchobj.group(2)[int(matchobj.group(1)):]

print(re.sub(r"[+-](\d+)([a-zA-Z]+)", repl, x))
You can pass your own function to re.sub to make customized replacements.
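To make the mechanics explicit, here is a small sketch of the same idea with the pattern precompiled and the callback broken into steps. The names INDEL, drop_indel and strip_indels are chosen here for illustration and are not part of the re module.

```python
import re

# Sign, then a number (captured), then a run of letters (captured).
INDEL = re.compile(r"[+-](\d+)([a-zA-Z]+)")

def drop_indel(match):
    count = int(match.group(1))  # how many letters to discard
    letters = match.group(2)     # the whole run of letters after the number
    return letters[count:]       # keep whatever follows the first `count` letters

def strip_indels(text):
    # re.sub calls drop_indel once per match and substitutes its return value.
    return INDEL.sub(drop_indel, text)

print(strip_indels("...-2GTC,.,"))  # ...C,.,
```

Because slicing past the end of a string yields an empty string, a count larger than the run of letters simply removes the whole run instead of raising an error.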