I'm making a Python program that will parse the fields in some input lines. I'd like to let the user enter the field separator as an option from the command line. I'm using optparse to do this. I'm running into the problem that entering something like \t will separate literally on \t, rather than on a tab, which is what I want. I'm pretty sure this is a Python thing and not the shell, since I've tried every combo of quotes, backslashes, and t's that I can think of.
If I could get optparse to let the argument be plain input (is there such a thing?) rather than raw_input, I think that would work. But I have no clue how to do that.
I've also tried various substitutions and regex tricks to turn the string from the two-character "\t" into the one-character tab, but without success.
Example, where input.txt is:
field 1[tab]field\t2
(Note: [tab] is a tab character and field\t2 is an 8 character string)
parseme.py:
#!/usr/bin/python
from optparse import OptionParser
parser = OptionParser()
parser.add_option("-d", "--delimiter", action="store", type="string",
                  dest="delimiter", default='\t')
parser.add_option("-f", dest="filename")
(options, args) = parser.parse_args()
Infile = open(options.filename, 'r')
Line = Infile.readline()
Fields = Line.split(options.delimiter)
print Fields[0]
print options.delimiter
Infile.close()
This gives me:
$ parseme.py -f input.txt
field 1
[tab]
Hey, great, the default setting worked properly. (Yes, I know I could just make \t the default and forget about it, but I'd like to know how to deal with this type of problem.)
$ parseme.py -f input.txt -d '\t'
field 1[tab]field
\t
This is not what I want.
4 Answers
#1
7
>>> r'\t\n\v\r'.decode('string-escape')
'\t\n\x0b\r'
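The 'string-escape' codec above exists only in Python 2. As a rough sketch of the same idea for Python 3, the 'unicode_escape' codec can be used instead; the helper name unescape is made up for illustration, and the round trip through latin-1 is one common idiom, not the only one.

```python
def unescape(text):
    """Turn backslash escapes such as the two characters \\t into real characters.

    Hypothetical helper for Python 3. Characters outside latin-1 are first
    rewritten as \\uXXXX escapes so the unicode_escape codec can restore them.
    """
    return text.encode("latin-1", "backslashreplace").decode("unicode_escape")


delimiter = unescape(r"\t")                    # what the user typed after -d
print("field 1\tfield\\t2".split(delimiter))   # splits on the real tab only
```

Input that ends in a lone backslash is not a valid escape and makes the decode step raise an error, so a real script would probably want to catch that.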
#2
0
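The quick and dirty way is to eval it, like this:
eval('"%s"' % options.delimiter, {}, {})
The delimiter has to be wrapped in quotes so that eval sees a string literal, and the two empty dicts keep the evaluated expression away from your program's own names. This does not make eval safe on untrusted input.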
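If evaluating the argument is the route you take, ast.literal_eval is a narrower tool than eval: it only accepts Python literals. The following is a minimal sketch for Python 3, with a made-up helper name, that wraps the text in double quotes so it parses as a string literal.

```python
import ast


def unescape_with_literal_eval(text):
    """Interpret backslash escapes by parsing the text as a string literal.

    Hypothetical helper. Double quotes in the input are escaped first so
    they cannot end the literal early.
    """
    quoted = '"' + text.replace('"', '\\"') + '"'
    return ast.literal_eval(quoted)


delimiter = unescape_with_literal_eval(r"\t")
print(repr(delimiter))
```

Malformed input, for example a value ending in a single backslash, raises SyntaxError, so this still needs error handling around it.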
#3
0
Solving it from within your script:
import re
options.delimiter = re.sub("\\\\t", "\t", options.delimiter)
You can adapt the regex above to match more escaped characters (\n, \r, etc.).
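One way that adaptation might look is a single substitution driven by a lookup table. This is only a sketch: the ESCAPES table and the function name unescape_known are invented for the example, and the set of escapes it covers is an arbitrary choice.

```python
import re

# Hypothetical table: the character after the backslash -> the real character.
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "v": "\v", "\\": "\\"}


def unescape_known(text):
    """Replace backslash + known letter; leave any other sequence untouched."""
    return re.sub(r"\\([tnrv\\])", lambda m: ESCAPES[m.group(1)], text)


print(repr(unescape_known(r"\t")))
print(repr(unescape_known(r"a\nb")))
```

Because a doubled backslash is in the table too, a user can still ask for a literal backslash followed by t by typing \\t.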
Another way to solve the problem, outside Python: when you call your script from the shell, do it like this:
parseme.py -f input.txt -d '^V<tab>'
Here ^V means "press Ctrl+V"; then press the normal Tab key. This passes a real tab character to your Python script.
#4
0
The callback option is a good way to handle tricky cases:
parser.add_option("-d", "--delimiter", action="callback", type="string",
                  callback=my_callback, default='\t')
With the corresponding function (which must be defined before the parser):
def my_callback(option, opt, value, parser):
    val = value
    if value == '\\t':
        val = '\t'
    elif value == '\\n':
        val = '\n'
    parser.values.delimiter = val
You can check this works via the command line: python test.py -f test.txt -d '\t' (keep the quotes in a POSIX shell; unquoted, the shell strips the backslash and the script receives a plain t).
It has the advantage of handling the option via the 'optparse' module, not via post-processing the parsing results.
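Put together, the callback approach could look roughly like the sketch below, written for Python 3 (where optparse is still in the standard library). The ESCAPES table, the function names, and the hard-coded argument list are assumptions made so the example runs on its own; a real script would call parse_args() with no arguments.

```python
from optparse import OptionParser

# Hypothetical table of the escapes this script chooses to understand.
ESCAPES = {r"\t": "\t", r"\n": "\n", r"\r": "\r"}


def unescape_callback(option, opt, value, parser):
    # Store the real character if the value is a known escape, else the value as typed.
    setattr(parser.values, option.dest, ESCAPES.get(value, value))


def build_parser():
    parser = OptionParser()
    parser.add_option("-d", "--delimiter", action="callback", type="string",
                      callback=unescape_callback, dest="delimiter", default="\t")
    parser.add_option("-f", dest="filename")
    return parser


# Simulate: parseme.py -f input.txt -d '\t'
options, args = build_parser().parse_args(["-f", "input.txt", "-d", r"\t"])
fields = "field 1\tfield\\t2".split(options.delimiter)
print(fields[0])
```

Using option.dest instead of naming the attribute directly lets the same callback serve any other option that needs the same treatment.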