I want to replace control characters (ASCII 0-31) and spaces (ASCII 32) with hex escape codes. For example:
$ escape 'label=My Disc'
label=My\x20Disc
$ escape $'multi\nline\ttabbed string'
multi\x0Aline\x09tabbed\x20string
$ escape '\'
\\
For context, I'm writing a script that reports the status of a DVD drive. Its output is designed to be parsed by another program, so my idea is to print each piece of information as a separate space-delimited word. For example:
$ ./discStatus --monitor
/dev/dvd: no-disc
/dev/dvd: disc blank writable size=0 capacity=2015385600
/dev/dvd: disc not-blank not-writable size=2015385600 capacity=2015385600
I want to add the disc's label to this output. To fit the parsing scheme, I need to escape spaces and newlines, and I might as well escape all the other control characters too.
I'd prefer to stick to bash, sed, awk, tr, etc., if possible. I can't think of a really elegant way to do this with those tools, though. I'm willing to use perl or python if there's no good solution with basic shell constructs and tools.
3 solutions
#1
2
Here's a Perl one-liner I came up with. It uses the /e modifier to evaluate the replacement as code.
perl -pe 's/([\x00-\x20\\])/sprintf("\\x%02X", ord($1))/eg'
A slight deviation from the example in my question: it emits \x5C for backslashes instead of \\.
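For comparison, here is a rough Python sketch of the same idea that keeps the \\ form from the question for backslashes. It assumes Python 3 and uses re.sub with a function as the replacement, which plays a role similar to Perl's /e. The name escape is only illustrative, borrowed from the example command in the question.

```python
import re

# Same character class as the Perl one-liner: bytes 0x00-0x20 plus backslash.
ESCAPE_RE = re.compile(r'[\x00-\x20\\]')

def escape(text):
    """Return text with control characters and spaces written as \\xNN
    and each backslash doubled (illustrative helper, not from the thread)."""
    def replacement(match):
        ch = match.group(0)
        if ch == '\\':
            return '\\\\'              # special case: double the backslash
        return '\\x%02X' % ord(ch)     # uppercase hex, as in the question
    return ESCAPE_RE.sub(replacement, text)

print(escape('label=My Disc'))
print(escape('multi\nline\ttabbed string'))
```

The two print calls should reproduce the first two examples from the question.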
#2
0
I would use a higher-level language. There are three different types of replacement going on (single character to multicharacter for the control characters and space, identity for other printable characters, and the special case of doubling the backslash), which I think is too much for awk, sed, and the like to handle simply.
Here's my approach in Python:
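def translate(c):
    # Map one character to its escaped form.
    cp = ord(c)
    if cp in range(33):
        # Control characters and space become \xNN (uppercase hex, as in the question).
        return '\\x%02X' % (cp,)
    elif c == '\\':
        # A backslash is doubled so the output stays unambiguous.
        return r'\\'
    else:
        return c

if __name__ == '__main__':
    import sys
    # Parenthesized print works under both Python 2 and Python 3.
    print(''.join(map(translate, sys.argv[1])))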
If speed is a concern, you can replace the translate function with a prebuilt dictionary mapping each character to its desired string representation.
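The prebuilt dictionary suggestion could look roughly like the sketch below. It assumes Python 3, where str.translate accepts a dict keyed by code point. The names ESCAPE_TABLE and escape_with_table are made up for illustration.

```python
# Build the table once: code points 0-32 map to \xNN, backslash maps to \\.
ESCAPE_TABLE = {cp: '\\x%02X' % cp for cp in range(33)}
ESCAPE_TABLE[ord('\\')] = '\\\\'

def escape_with_table(text):
    # str.translate leaves any character not in the table unchanged.
    return text.translate(ESCAPE_TABLE)

print(escape_with_table('label=My Disc'))
```

Because the table is built once at import time, each call does only dictionary lookups, with no per-character branching in Python code.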
#3
-1
Wow, this looks like a fairly trivial sed script, along the lines of 's|\n|\\n|' for each character you want to substitute.