We want to remove ^[, and all of the escape sequences.
sed is not working and is giving us this error:
$ sed 's/^[//g' oldfile > newfile; mv newfile oldfile;
sed: -e expression #1, char 7: unterminated `s' command
$ sed -i '' -e 's/^[//g' somefile
sed: -e expression #1, char 7: unterminated `s' command
8 Solutions
#1
33
Are you looking for ansifilter?
Otherwise, there are two things you can do. First, enter the literal escape character (in bash):
Using keyboard entry:
sed 's/Ctrl-vEsc//g'
Alternatively:
sed 's/Ctrl-vCtrl-[//g'
Or you can use character escapes:
sed 's/\x1b//g'
Or, for all control characters:
sed 's/[\x01-\x1F\x7F]//g' # NOTE: zaps TAB character too!
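The \x1b escape inside a sed regex is a GNU extension, so the commands above may not work with other sed implementations. A minimal sketch of a more portable variant, assuming a POSIX shell, builds the ESC byte with printf and interpolates it into the expression. The function name strip_esc_byte and the sample text are made up for illustration:

```shell
# Literal ESC byte (octal 033), produced portably by printf.
esc=$(printf '\033')

# Hypothetical helper: delete every ESC byte from stdin.
strip_esc_byte() {
    sed "s/${esc}//g"
}

# Only the ESC byte goes away; the rest of each sequence stays behind.
printf '\033[31mred\033[0m plain\n' | strip_esc_byte
# prints: [31mred[0m plain
```

Removing the ESC byte alone leaves the parameters (such as [31m) in the text, which is why the other answers match the whole sequence rather than the single byte.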
#2
15
I managed with the following for my purposes, but this doesn't include all possible ANSI escapes:
sed -r 's/\x1b\[[0-9;]*m?//g'
This removes m commands, but for all escapes (as commented by @lethalman) use:
sed -r 's/\x1b\[[^@-~]*[@-~]//g'
Also see "Python regex to match VT100 escape sequences".
There is also a table of common escape sequences.
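To illustrate the second pattern: a CSI sequence is ESC [ followed by parameter and intermediate bytes and then one final byte in the range @ to ~, which is what [^@-~]*[@-~] matches. A hedged sketch, assuming a POSIX shell and a sed that accepts -E; the helper name strip_csi is hypothetical, and LC_ALL=C is set so that the @-~ range is read as a plain byte range:

```shell
# Literal ESC byte, so the expression does not depend on \x1b support.
esc=$(printf '\033')

# Hypothetical helper: remove every ESC [ ... <final byte> sequence.
strip_csi() {
    LC_ALL=C sed -E "s/${esc}\[[^@-~]*[@-~]//g"
}

# A colour change, a reset, and an erase-line command are all removed.
printf '\033[1;31mbold\033[0m\033[2Kdone\n' | strip_csi
# prints: bolddone
```

This covers CSI sequences only; escapes that do not start with ESC [ (for example ESC ( B) pass through untouched.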
#3
8
The ansi2txt command (part of the kbtin package) seems to do the job perfectly on Ubuntu.
#4
6
commandlinefu gives the correct answer, which strips ANSI colours as well as movement commands:
sed "s,\x1B\[[0-9;]*[a-zA-Z],,g"
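One caveat worth checking against your own input: this pattern only allows digits and semicolons as parameters, so private-mode sequences such as ESC [ ? 25 l (hide cursor) are left in place. A small sketch, assuming a POSIX shell and a cat that supports -v (which displays ESC as ^[); the helper name strip_simple is hypothetical, and the ESC byte is built with printf instead of \x1B:

```shell
esc=$(printf '\033')

# Hypothetical helper wrapping the pattern from this answer.
strip_simple() {
    sed "s,${esc}\[[0-9;]*[a-zA-Z],,g"
}

# Colour codes are removed completely.
printf '\033[1;32mok\033[0m\n' | strip_simple | cat -v
# prints: ok

# A sequence with a "?" parameter does not match and survives.
printf '\033[?25lhidden\n' | strip_simple | cat -v
# prints: ^[[?25lhidden
```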
#5
4
I stumbled upon this post while looking for a way to strip extra formatting from man pages. ansifilter did it, but the result was far from what I wanted (for example, all previously bold characters were duplicated, like SSYYNNOOPPSSIISS).
For that task the correct command would be col -bx, for example:
groff -man -Tascii fopen.3 | col -bx > fopen.3.txt
(source)
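The doubled letters come from overstriking: in this kind of nroff-style output, bold is encoded as character, backspace, same character, and underline as underscore, backspace, character. Where col is not available, a rough substitute (a different technique from col -bx, and only a sketch) is to delete every character that is immediately followed by a backspace. The helper name strip_overstrike is hypothetical:

```shell
# Literal backspace byte.
bs=$(printf '\b')

# Hypothetical helper: drop each "character + backspace" pair,
# keeping only the character that was printed last in that column.
strip_overstrike() {
    sed "s/.${bs}//g"
}

# Bold "SYN" followed by underlined "foo".
printf 'S\bSY\bYN\bN _\bf_\bo_\bo\n' | strip_overstrike
# prints: SYN foo
```

Unlike col -bx, this does not expand tabs or handle reverse line feeds; it only undoes simple overstrikes.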
#6
2
I built vtclean for this. It strips escape sequences using these regular expressions in order (explained in regex.txt):
// handles long-form RGB codes
^\033](\d+);([^\033]+)\033\\
// excludes non-movement/color codes
^\033(\[[^a-zA-Z0-9@\?]+|[\(\)]).
// parses movement and color codes
^\033([\[\]]([\d\?]+)?(;[\d\?]+)*)?(.)
It additionally does basic line-edit emulation, so backspace and other movement characters (like left arrow key) are parsed.
#7
1
Just a note; let's say you have a file like this (such line endings are generated by git remote reports):
echo -e "remote: * 27625a8 (HEAD, master) 1st git commit\x1b[K
remote: \x1b[K
remote: \x1b[K
remote: \x1b[K
remote: \x1b[K
remote: \x1b[K
remote: Current branch master is up to date.\x1b[K" > chartest.txt
As a hex dump, this looks like this:
$ cat chartest.txt | hexdump -C
00000000 72 65 6d 6f 74 65 3a 20 2a 20 32 37 36 32 35 61 |remote: * 27625a|
00000010 38 20 28 48 45 41 44 2c 20 6d 61 73 74 65 72 29 |8 (HEAD, master)|
00000020 20 31 73 74 20 67 69 74 20 63 6f 6d 6d 69 74 1b | 1st git commit.|
00000030 5b 4b 0a 72 65 6d 6f 74 65 3a 20 1b 5b 4b 0a 72 |[K.remote: .[K.r|
00000040 65 6d 6f 74 65 3a 20 1b 5b 4b 0a 72 65 6d 6f 74 |emote: .[K.remot|
00000050 65 3a 20 1b 5b 4b 0a 72 65 6d 6f 74 65 3a 20 1b |e: .[K.remote: .|
00000060 5b 4b 0a 72 65 6d 6f 74 65 3a 20 1b 5b 4b 0a 72 |[K.remote: .[K.r|
00000070 65 6d 6f 74 65 3a 20 43 75 72 72 65 6e 74 20 62 |emote: Current b|
00000080 72 61 6e 63 68 20 6d 61 73 74 65 72 20 69 73 20 |ranch master is |
00000090 75 70 20 74 6f 20 64 61 74 65 2e 1b 5b 4b 0a |up to date..[K.|
0000009f
You can see that git here adds the sequence 0x1b 0x5b 0x4b before each line ending (0x0a).
Note that while you can match 0x1b with the literal format \x1b in sed, you cannot do the same for 0x5b, which represents the left square bracket [ (at least with the sed used here):
$ cat chartest.txt | sed 's/\x1b\x5b//g' | hexdump -C
sed: -e expression #1, char 13: Invalid regular expression
You might think you can escape the representation with an extra backslash \, which ends up as \\x5b. That "passes", but it doesn't match anything as intended, because \\ now matches a literal backslash followed by the plain text x5b:
$ cat chartest.txt | sed 's/\x1b\\x5b//g' | hexdump -C
00000000 72 65 6d 6f 74 65 3a 20 2a 20 32 37 36 32 35 61 |remote: * 27625a|
00000010 38 20 28 48 45 41 44 2c 20 6d 61 73 74 65 72 29 |8 (HEAD, master)|
00000020 20 31 73 74 20 67 69 74 20 63 6f 6d 6d 69 74 1b | 1st git commit.|
00000030 5b 4b 0a 72 65 6d 6f 74 65 3a 20 1b 5b 4b 0a 72 |[K.remote: .[K.r|
00000040 65 6d 6f 74 65 3a 20 1b 5b 4b 0a 72 65 6d 6f 74 |emote: .[K.remot|
...
So if you want to match this character, apparently you must write it as an escaped left square bracket, that is \[; the rest of the values can then be entered with escaped \x notation:
$ cat chartest.txt | sed 's/\x1b\[\x4b//g' | hexdump -C
00000000 72 65 6d 6f 74 65 3a 20 2a 20 32 37 36 32 35 61 |remote: * 27625a|
00000010 38 20 28 48 45 41 44 2c 20 6d 61 73 74 65 72 29 |8 (HEAD, master)|
00000020 20 31 73 74 20 67 69 74 20 63 6f 6d 6d 69 74 0a | 1st git commit.|
00000030 72 65 6d 6f 74 65 3a 20 0a 72 65 6d 6f 74 65 3a |remote: .remote:|
00000040 20 0a 72 65 6d 6f 74 65 3a 20 0a 72 65 6d 6f 74 | .remote: .remot|
00000050 65 3a 20 0a 72 65 6d 6f 74 65 3a 20 0a 72 65 6d |e: .remote: .rem|
00000060 6f 74 65 3a 20 43 75 72 72 65 6e 74 20 62 72 61 |ote: Current bra|
00000070 6e 63 68 20 6d 61 73 74 65 72 20 69 73 20 75 70 |nch master is up|
00000080 20 74 6f 20 64 61 74 65 2e 0a | to date..|
0000008a
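The same check can be reproduced on a smaller input without reading a hex dump. This sketch assumes a POSIX shell and a cat that supports -v, which prints ESC as ^[ and so makes the bytes visible; the helper name strip_el and the two sample lines are illustrative only:

```shell
esc=$(printf '\033')

# Hypothetical helper: remove the ESC [ K (erase to end of line) sequence.
# The bracket is written as \[ so sed treats it as a literal character.
strip_el() {
    sed "s/${esc}\[K//g"
}

# Before: each line ends with ESC [ K.
printf 'remote: \033[K\nremote: done.\033[K\n' | cat -v
# prints:
# remote: ^[[K
# remote: done.^[[K

# After: the sequence is gone and only the text remains.
printf 'remote: \033[K\nremote: done.\033[K\n' | strip_el | cat -v
```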
#8
0
I don't have enough reputation to add a comment to the answer given by Luke H, but I did want to share the regular expression that I've been using to eliminate all of the ASCII Escape Sequences.
sed -r 's~\x01?(\x1B\(B)?\x1B\[([0-9;]*)?[JKmsu]\x02?~~g'
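For readers wondering about the extra bytes in this expression: \x01 and \x02 appear to be the markers readline uses around non-printing prompt text, and ESC ( B is a charset-select sequence that commonly accompanies a reset such as the one tput sgr0 emits; treat both readings as assumptions rather than the author's stated intent. A sketch with a made-up prompt string, building the control bytes with printf so that it does not rely on \x escapes; strip_prompt_codes is a hypothetical name:

```shell
# Literal control bytes: SOH (0x01), STX (0x02) and ESC (0x1b).
soh=$(printf '\001')
stx=$(printf '\002')
esc=$(printf '\033')

# Hypothetical helper wrapping the expression from this answer.
strip_prompt_codes() {
    sed -E "s~${soh}?(${esc}\(B)?${esc}\[([0-9;]*)?[JKmsu]${stx}?~~g"
}

# A coloured prompt wrapped in SOH/STX markers, followed by a reset.
printf '\001\033[1;34m\002prompt\001\033(B\033[m\002$ \n' | strip_prompt_codes
# prints: prompt$ (followed by a trailing space)
```

Because the final byte is limited to J, K, m, s and u, other sequences such as cursor movement (for example ESC [ 2 A) are not removed by this expression.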