I am getting a filename from an API in this format, containing a mix of / and \.
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
When I try to parse the directory structure, a \ followed by a character is converted into a single character.
Is there a way to get each component correctly?
What I already tried:
os.path.normpath didn't help:
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
os.path.normpath(infilename)
out: 'c:\\mydir1\\mydir2\\mydir3\\mydir4Sxyz.csv'
3 solutions
#1
1
That's not visible in your example, but writing this:
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
isn't a good idea, because some lowercase (and a few uppercase) letters are interpreted as escape sequences when they follow a backslash. Notorious examples are \t and \b, and there are others. For instance:
infilename = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'
fails twice over, because two backslash-letter pairs are interpreted as "tab" and "backspace".
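A small sketch of the effect, comparing a normal literal with a raw one. The variable names plain and raw are only for illustration. Note that \123 is also consumed: it is an octal escape for the character S, which matches the mydir4Sxyz.csv output shown in the question.

```python
# Same text typed twice: once as a normal literal, once as a raw literal.
plain = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'
raw = r'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'

# repr() makes the hidden control characters visible.
print(repr(plain))
print(repr(raw))

# The normal literal is shorter because each escape sequence
# collapsed into a single character.
print(len(plain), len(raw))
```

Comparing the two repr() outputs shows which backslash sequences were swallowed by the parser.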
When dealing with literal Windows-style paths (or regexes), you have to use the raw prefix and, better still, normalize your path to get rid of the forward slashes.
infilename = os.path.normpath(r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv')
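To get each component once the string is intact, one option (not part of the original answer, just a sketch) is pathlib.PureWindowsPath. It treats both / and \ as separators and works on any operating system, whereas os.path.normpath only converts forward slashes to backslashes when running on Windows.

```python
from pathlib import PureWindowsPath

# Raw literal, so the backslashes survive as typed.
infilename = r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

# PureWindowsPath applies Windows path rules without touching the file system.
path = PureWindowsPath(infilename)

parts = path.parts  # drive/root first, then each directory, then the file name
print(parts)
print(path.name)    # last component only
```

The first element of parts combines the drive and the root, so drop it if only the directory names are needed.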
However, the raw prefix only applies to literals. If printing repr(string) on the returned string shows something like 'the\terrible\\dir', then tab characters have already been put into the string, and there's nothing you can do except some lousy post-processing.
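For what it's worth, such post-processing could look like the sketch below. The names UNESCAPE and repair_path are hypothetical, and the approach assumes that every control character in the string came from a mangled backslash-letter pair. It cannot recover octal or hex escapes such as \123, which have already become ordinary characters.

```python
# Control characters produced by single-letter escapes, mapped back to the
# backslash-plus-letter text they most likely came from (an assumption).
UNESCAPE = {
    '\a': r'\a', '\b': r'\b', '\f': r'\f', '\n': r'\n',
    '\r': r'\r', '\t': r'\t', '\v': r'\v',
}

def repair_path(damaged):
    """Replace each known control character with backslash + letter."""
    return ''.join(UNESCAPE.get(ch, ch) for ch in damaged)

# A string in which \t and \b were already turned into tab and backspace.
damaged = 'c:/mydir1/mydir2\thedir3\bigdir4\\file.csv'
repaired = repair_path(damaged)
print(repaired)
```

This is only a best-effort repair; fixing the code that produces the string is the more reliable route.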
#2
0
Use r before the string to make it a raw string (i.e. backslash escape sequences are not processed).
e.g.
infilename = r'C:/blah/blah/blah.csv'
More details here: https://docs.python.org/3.6/reference/lexical_analysis.html#string-and-bytes-literals
#3
0
Instead of parsing by \, try parsing by \\. In a normal string literal the backslash itself has to be escaped with another backslash, so the \ character is actually written as \\.
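A minimal sketch of that idea, assuming the string really contains backslashes (here guaranteed by a raw literal). It uses re.split with a character class so that both separators are handled in one pass; the variable name components is only illustrative.

```python
import re

# '\\' in source code is one backslash character in the actual string.
print(len('\\'))

infilename = r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

# Split on either a backslash or a forward slash.
components = re.split(r'[\\/]', infilename)
print(components)
```

Unlike a path-aware parser, this treats the drive letter as just another component.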