This is my string:
这是我的字符串:
'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
I was using code to retrieve the output from a SSH command and I want my string to only contain 'examplefile.zip'
我使用代码从SSH命令检索输出,我希望我的字符串只包含'examplefile.zip'
What I can use to remove the extra escape sequences?
我可以用什么来移除额外的转义序列?
5 个解决方案
#1
65
Delete them with a regular expression:
用正则表达式删除它们:
import re
ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')
ansi_escape.sub('', sometext)
Demo:
演示:
>>> import re
>>> ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')
>>> sometext = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
>>> ansi_escape.sub('', sometext)
'ls\r\nexamplefile.zip\r\n'
(I've tidied up the escape sequence expression to follow the Wikipedia overview of ANSI escape codes, focusing on the CSI sequences, and ignoring the C1 codes as they are never used in today's UTF-8 world).
(我整理了转义序列的表达式,以跟随*对ANSI escape代码的概述,关注CSI序列,忽略了在今天的UTF-8世界中从未使用过的C1代码)。
#2
33
The accepted answer to this question only considers color and font effects. There are a lot of sequences that do not end in 'm', such as cursor positioning, erasing, and scroll regions.
这个问题的答案只考虑颜色和字体效果。有很多序列不以“m”结尾,比如光标定位、擦除和滚动区域。
The complete regexp for Control Sequences (aka ANSI Escape Sequences) is
控制序列的完整regexp(即ANSI转义序列)是。
/(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]/
Refer to ECMA-48 Section 5.4 and ANSI escape code
参考ECMA-48第5.4节和ANSI转义代码。
#3
14
Function
Based on Martijn Pieters♦'s answer with Jeff's regexp.
基于卡坦彼得♦回答与杰夫的regexp。
def escape_ansi(line):
ansi_escape = re.compile(r'(\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')
return ansi_escape.sub('', line)
Test
def test_remove_ansi_escape_sequence(self):
line = '\t\u001b[0;35mBlabla\u001b[0m \u001b[0;36m172.18.0.2\u001b[0m'
escaped_line = escape_ansi(line)
self.assertEqual(escaped_line, '\tBlabla 172.18.0.2')
Testing
If you want to run it by yourself, use python3
(better unicode support, blablabla). Here is how the test file should be:
如果您想自己运行它,请使用python3(更好的unicode支持,blablabla)。下面是测试文件应该如何:
import unittest
import re
def escape_ansi(line):
…
class TestStringMethods(unittest.TestCase):
def test_remove_ansi_escape_sequence(self):
…
if __name__ == '__main__':
unittest.main()
#4
3
The suggested regex didn't do the trick for me so I created one of my own. The following is a python regex that I created based on the spec found here
建议的regex并没有为我做这个,所以我创建了自己的一个。下面是我根据在这里找到的规范创建的python regex。
ansi_regex = r'\x1b(' \
r'(\[\??\d+[hl])|' \
r'([=<>a-kzNM78])|' \
r'([\(\)][a-b0-2])|' \
r'(\[\d{0,2}[ma-dgkjqi])|' \
r'(\[\d+;\d+[hfy]?)|' \
r'(\[;?[hf])|' \
r'(#[3-68])|' \
r'([01356]n)|' \
r'(O[mlnp-z]?)|' \
r'(/Z)|' \
r'(\d+)|' \
r'(\[\?\d;\d0c)|' \
r'(\d;\dR))'
ansi_escape = re.compile(ansi_regex, flags=re.IGNORECASE)
I tested my regex on the following snippet (basically a copy paste from the ascii-table.com page)
我在下面的代码片段中测试了我的正则表达式(基本上是来自ascii-table.com页面的复制粘贴)
\x1b[20h Set
\x1b[?1h Set
\x1b[?3h Set
\x1b[?4h Set
\x1b[?5h Set
\x1b[?6h Set
\x1b[?7h Set
\x1b[?8h Set
\x1b[?9h Set
\x1b[20l Set
\x1b[?1l Set
\x1b[?2l Set
\x1b[?3l Set
\x1b[?4l Set
\x1b[?5l Set
\x1b[?6l Set
\x1b[?7l Reset
\x1b[?8l Reset
\x1b[?9l Reset
\x1b= Set
\x1b> Set
\x1b(A Set
\x1b)A Set
\x1b(B Set
\x1b)B Set
\x1b(0 Set
\x1b)0 Set
\x1b(1 Set
\x1b)1 Set
\x1b(2 Set
\x1b)2 Set
\x1bN Set
\x1bO Set
\x1b[m Turn
\x1b[0m Turn
\x1b[1m Turn
\x1b[2m Turn
\x1b[4m Turn
\x1b[5m Turn
\x1b[7m Turn
\x1b[8m Turn
\x1b[1;2 Set
\x1b[1A Move
\x1b[2B Move
\x1b[3C Move
\x1b[4D Move
\x1b[H Move
\x1b[;H Move
\x1b[4;3H Move
\x1b[f Move
\x1b[;f Move
\x1b[1;2 Move
\x1bD Move/scroll
\x1bM Move/scroll
\x1bE Move
\x1b7 Save
\x1b8 Restore
\x1bH Set
\x1b[g Clear
\x1b[0g Clear
\x1b[3g Clear
\x1b#3 Double-height
\x1b#4 Double-height
\x1b#5 Single
\x1b#6 Double
\x1b[K Clear
\x1b[0K Clear
\x1b[1K Clear
\x1b[2K Clear
\x1b[J Clear
\x1b[0J Clear
\x1b[1J Clear
\x1b[2J Clear
\x1b5n Device
\x1b0n Response:
\x1b3n Response:
\x1b6n Get
\x1b[c Identify
\x1b[0c Identify
\x1b[?1;20c Response:
\x1bc Reset
\x1b#8 Screen
\x1b[2;1y Confidence
\x1b[2;2y Confidence
\x1b[2;9y Repeat
\x1b[2;10y Repeat
\x1b[0q Turn
\x1b[1q Turn
\x1b[2q Turn
\x1b[3q Turn
\x1b[4q Turn
\x1b< Enter/exit
\x1b= Enter
\x1b> Exit
\x1bF Use
\x1bG Use
\x1bA Move
\x1bB Move
\x1bC Move
\x1bD Move
\x1bH Move
\x1b12 Move
\x1bI
\x1bK
\x1bJ
\x1bZ
\x1b/Z
\x1bOP
\x1bOQ
\x1bOR
\x1bOS
\x1bA
\x1bB
\x1bC
\x1bD
\x1bOp
\x1bOq
\x1bOr
\x1bOs
\x1bOt
\x1bOu
\x1bOv
\x1bOw
\x1bOx
\x1bOy
\x1bOm
\x1bOl
\x1bOn
\x1bOM
\x1b[i
\x1b[1i
\x1b[4i
\x1b[5i
Hopefully this will help others :)
希望这能帮助别人:)
#5
-1
if you want to remove the \r\n
bit, you can pass the string through this function (written by sarnold):
如果您想删除\r\n位,您可以通过这个函数传递字符串(由sarnold编写):
def stripEscape(string):
""" Removes all escape sequences from the input string """
delete = ""
i=1
while (i<0x20):
delete += chr(i)
i += 1
t = string.translate(None, delete)
return t
Careful though, this will lump together the text in front and behind the escape sequences. So, using Martijn's filtered string 'ls\r\nexamplefile.zip\r\n'
, you will get lsexamplefile.zip
. Note the ls
in front of the desired filename.
但是要小心,这将把文本放在前面和后面的转义序列后面。所以,使用Martijn的过滤字符串'ls\r\nexamplefile。你会得到lsexamplefile.zip。注意在需要的文件名前面的ls。
I would use the stripEscape function first to remove the escape sequences, then pass the output to Martijn's regular expression, which would avoid concatenating the unwanted bit.
我将首先使用stripEscape函数来删除转义序列,然后将输出传递给Martijn的正则表达式,这样可以避免将多余的部分连接起来。
#1
65
Delete them with a regular expression:
用正则表达式删除它们:
import re
ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')
ansi_escape.sub('', sometext)
Demo:
演示:
>>> import re
>>> ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')
>>> sometext = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
>>> ansi_escape.sub('', sometext)
'ls\r\nexamplefile.zip\r\n'
(I've tidied up the escape sequence expression to follow the Wikipedia overview of ANSI escape codes, focusing on the CSI sequences, and ignoring the C1 codes as they are never used in today's UTF-8 world).
(我整理了转义序列的表达式,以跟随*对ANSI escape代码的概述,关注CSI序列,忽略了在今天的UTF-8世界中从未使用过的C1代码)。
#2
33
The accepted answer to this question only considers color and font effects. There are a lot of sequences that do not end in 'm', such as cursor positioning, erasing, and scroll regions.
这个问题的答案只考虑颜色和字体效果。有很多序列不以“m”结尾,比如光标定位、擦除和滚动区域。
The complete regexp for Control Sequences (aka ANSI Escape Sequences) is
控制序列的完整regexp(即ANSI转义序列)是。
/(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]/
Refer to ECMA-48 Section 5.4 and ANSI escape code
参考ECMA-48第5.4节和ANSI转义代码。
#3
14
Function
Based on Martijn Pieters♦'s answer with Jeff's regexp.
基于卡坦彼得♦回答与杰夫的regexp。
def escape_ansi(line):
ansi_escape = re.compile(r'(\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')
return ansi_escape.sub('', line)
Test
def test_remove_ansi_escape_sequence(self):
line = '\t\u001b[0;35mBlabla\u001b[0m \u001b[0;36m172.18.0.2\u001b[0m'
escaped_line = escape_ansi(line)
self.assertEqual(escaped_line, '\tBlabla 172.18.0.2')
Testing
If you want to run it by yourself, use python3
(better unicode support, blablabla). Here is how the test file should be:
如果您想自己运行它,请使用python3(更好的unicode支持,blablabla)。下面是测试文件应该如何:
import unittest
import re
def escape_ansi(line):
…
class TestStringMethods(unittest.TestCase):
def test_remove_ansi_escape_sequence(self):
…
if __name__ == '__main__':
unittest.main()
#4
3
The suggested regex didn't do the trick for me so I created one of my own. The following is a python regex that I created based on the spec found here
建议的regex并没有为我做这个,所以我创建了自己的一个。下面是我根据在这里找到的规范创建的python regex。
ansi_regex = r'\x1b(' \
r'(\[\??\d+[hl])|' \
r'([=<>a-kzNM78])|' \
r'([\(\)][a-b0-2])|' \
r'(\[\d{0,2}[ma-dgkjqi])|' \
r'(\[\d+;\d+[hfy]?)|' \
r'(\[;?[hf])|' \
r'(#[3-68])|' \
r'([01356]n)|' \
r'(O[mlnp-z]?)|' \
r'(/Z)|' \
r'(\d+)|' \
r'(\[\?\d;\d0c)|' \
r'(\d;\dR))'
ansi_escape = re.compile(ansi_regex, flags=re.IGNORECASE)
I tested my regex on the following snippet (basically a copy paste from the ascii-table.com page)
我在下面的代码片段中测试了我的正则表达式(基本上是来自ascii-table.com页面的复制粘贴)
\x1b[20h Set
\x1b[?1h Set
\x1b[?3h Set
\x1b[?4h Set
\x1b[?5h Set
\x1b[?6h Set
\x1b[?7h Set
\x1b[?8h Set
\x1b[?9h Set
\x1b[20l Set
\x1b[?1l Set
\x1b[?2l Set
\x1b[?3l Set
\x1b[?4l Set
\x1b[?5l Set
\x1b[?6l Set
\x1b[?7l Reset
\x1b[?8l Reset
\x1b[?9l Reset
\x1b= Set
\x1b> Set
\x1b(A Set
\x1b)A Set
\x1b(B Set
\x1b)B Set
\x1b(0 Set
\x1b)0 Set
\x1b(1 Set
\x1b)1 Set
\x1b(2 Set
\x1b)2 Set
\x1bN Set
\x1bO Set
\x1b[m Turn
\x1b[0m Turn
\x1b[1m Turn
\x1b[2m Turn
\x1b[4m Turn
\x1b[5m Turn
\x1b[7m Turn
\x1b[8m Turn
\x1b[1;2 Set
\x1b[1A Move
\x1b[2B Move
\x1b[3C Move
\x1b[4D Move
\x1b[H Move
\x1b[;H Move
\x1b[4;3H Move
\x1b[f Move
\x1b[;f Move
\x1b[1;2 Move
\x1bD Move/scroll
\x1bM Move/scroll
\x1bE Move
\x1b7 Save
\x1b8 Restore
\x1bH Set
\x1b[g Clear
\x1b[0g Clear
\x1b[3g Clear
\x1b#3 Double-height
\x1b#4 Double-height
\x1b#5 Single
\x1b#6 Double
\x1b[K Clear
\x1b[0K Clear
\x1b[1K Clear
\x1b[2K Clear
\x1b[J Clear
\x1b[0J Clear
\x1b[1J Clear
\x1b[2J Clear
\x1b5n Device
\x1b0n Response:
\x1b3n Response:
\x1b6n Get
\x1b[c Identify
\x1b[0c Identify
\x1b[?1;20c Response:
\x1bc Reset
\x1b#8 Screen
\x1b[2;1y Confidence
\x1b[2;2y Confidence
\x1b[2;9y Repeat
\x1b[2;10y Repeat
\x1b[0q Turn
\x1b[1q Turn
\x1b[2q Turn
\x1b[3q Turn
\x1b[4q Turn
\x1b< Enter/exit
\x1b= Enter
\x1b> Exit
\x1bF Use
\x1bG Use
\x1bA Move
\x1bB Move
\x1bC Move
\x1bD Move
\x1bH Move
\x1b12 Move
\x1bI
\x1bK
\x1bJ
\x1bZ
\x1b/Z
\x1bOP
\x1bOQ
\x1bOR
\x1bOS
\x1bA
\x1bB
\x1bC
\x1bD
\x1bOp
\x1bOq
\x1bOr
\x1bOs
\x1bOt
\x1bOu
\x1bOv
\x1bOw
\x1bOx
\x1bOy
\x1bOm
\x1bOl
\x1bOn
\x1bOM
\x1b[i
\x1b[1i
\x1b[4i
\x1b[5i
Hopefully this will help others :)
希望这能帮助别人:)
#5
-1
if you want to remove the \r\n
bit, you can pass the string through this function (written by sarnold):
如果您想删除\r\n位,您可以通过这个函数传递字符串(由sarnold编写):
def stripEscape(string):
""" Removes all escape sequences from the input string """
delete = ""
i=1
while (i<0x20):
delete += chr(i)
i += 1
t = string.translate(None, delete)
return t
Careful though, this will lump together the text in front and behind the escape sequences. So, using Martijn's filtered string 'ls\r\nexamplefile.zip\r\n'
, you will get lsexamplefile.zip
. Note the ls
in front of the desired filename.
但是要小心,这将把文本放在前面和后面的转义序列后面。所以,使用Martijn的过滤字符串'ls\r\nexamplefile。你会得到lsexamplefile.zip。注意在需要的文件名前面的ls。
I would use the stripEscape function first to remove the escape sequences, then pass the output to Martijn's regular expression, which would avoid concatenating the unwanted bit.
我将首先使用stripEscape函数来删除转义序列,然后将输出传递给Martijn的正则表达式,这样可以避免将多余的部分连接起来。