Parsing a mix of backslashes and forward slashes in a filename

Date: 2022-11-17 21:19:36

I am getting a filename from an API in this format, containing a mix of / and \.

infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'


When I try to parse the directory structure, a \ followed by a character is converted into a single character.

Is there a way to get each component correctly?

What I have already tried:

os.path.normpath didn't help:

infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

os.path.normpath(infilename)

out: 'c:\\mydir1\\mydir2\\mydir3\\mydir4Sxyz.csv'

3 answers

#1


1  

It isn't visible in your example, but writing this:

infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

isn't a good idea, because a backslash followed by certain lowercase (and a few uppercase) letters, or by octal digits such as the 123 above, is interpreted as an escape sequence. Notorious examples are \t and \b; there are others. For instance:

infilename = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'

fails twice over, because \t and \b are interpreted as "tab" and "backspace".
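To make the effect concrete, here is a small sketch comparing the same literal with and without the raw prefix. The variable names plain and raw are only illustrative.

```python
# Non-raw literal: \t, \b and \123 are all valid escape sequences,
# so each one collapses into a single character.
plain = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'

# Raw literal: backslashes are kept exactly as typed.
raw = r'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'

print(repr(plain))  # shows the tab, the backspace and 'S' (octal 123)
print(repr(raw))    # shows three real backslashes

# The raw version is longer because no characters were swallowed.
print(len(plain), len(raw))
```

The octal escape \123 is the reason the question's output ends in 'Sxyz.csv'.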

When dealing with a literal Windows-style path (or a regex), you have to use the raw prefix and, better still, normalize the path so that the separators are consistent.

import os

infilename = os.path.normpath(r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv')
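Note that os.path.normpath only converts / to \ when the code runs on Windows. As a sketch of how the individual components could be obtained on any platform, the standard-library ntpath module and pathlib.PureWindowsPath always apply Windows rules. This assumes the string really contains backslashes, for instance because it was written as a raw literal.

```python
import ntpath
from pathlib import PureWindowsPath

infilename = r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

# ntpath is the Windows flavour of os.path and is importable everywhere.
normalized = ntpath.normpath(infilename)

# PureWindowsPath treats both / and \ as separators.
path = PureWindowsPath(infilename)
components = path.parts   # drive + root first, then each directory, then the file
filename = path.name      # last component only

print(normalized)
print(components)
print(filename)
```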

However, the raw prefix only applies to literals. If printing repr(string) for the returned string shows something like 'the\terrible\\dir', then the tab character is already in the string, and there's nothing you can do except some lousy post-processing.
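One possible form of that post-processing, as a rough sketch only: map each control character back to a backslash plus the letter that produced it. The helper name restore_backslashes and the mapping table are hypothetical, and the approach cannot undo octal or hex escapes (a \123 that already became 'S' is indistinguishable from a real 'S').

```python
# Control characters produced by single-letter escapes, mapped back to
# the two-character text that was probably intended in a path.
_CONTROL_TO_TEXT = {
    '\a': r'\a',
    '\b': r'\b',
    '\f': r'\f',
    '\n': r'\n',
    '\r': r'\r',
    '\t': r'\t',
    '\v': r'\v',
}


def restore_backslashes(damaged: str) -> str:
    """Best-effort repair of a path whose backslashes were eaten by escapes."""
    return ''.join(_CONTROL_TO_TEXT.get(ch, ch) for ch in damaged)


damaged = 'the\terrible\\dir'          # contains a real tab character
repaired = restore_backslashes(damaged)
print(repr(repaired))
```

If the API can be fixed to return the string without the damage, that is a far more reliable option than guessing after the fact.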

#2


0  

Put r before the string literal to make it a raw string (i.e. backslash escape sequences are not processed).

e.g.

infilename = r'C:/blah/blah/blah.csv'

More details here: https://docs.python.org/3.6/reference/lexical_analysis.html#string-and-bytes-literals

#3


0  

Instead of splitting on \ try splitting on \\. In a normal string literal the backslash itself has to be escaped, so the \ character is written as \\.
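As a minimal sketch of that idea, a regular expression can split on either separator in one pass. The pattern below is an illustration, not part of the original answer, and it assumes the input string still contains its backslashes.

```python
import re

infilename = r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

# Inside a raw regex pattern, \\ matches one literal backslash;
# the character class accepts either separator, repeated or not.
components = re.split(r'[\\/]+', infilename)

print(components)
```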
