Handling extended characters in Windows commands?

Time: 2022-04-09 17:34:19

I am debugging a Windows batch command file. It fails when extended (> 0x7F) characters are used in paths or file names. The problem seems to be related to passing parameters to a command file that is CALLed from another one.

For an example, this command works as expected:

xcopy "Pezuñero\1 - 001.wav" \temp

This does not:

call another.cmd "Pezuñero" 

Contents of "another.cmd":

xcopy "%~1\1 - 001.wav"    \temp

The %~1 syntax expands a parameter and removes quotes. This is necessary because in the real command file, the paths in either the calling or called command file may have spaces.

The result of the second example (copied from the CMD window) is this:

C:\>call another.cmd "Pezu±ero"    

C:\>xcopy "Pezu±ero\1 - 001.wav"    \temp
File not found - 1 - 001.wav
0 File(s) copied

Note that the "ñ" (0xF1) character has been changed to a "±" (0xB1).

Can anyone explain what is going on, and how to work around this?

5 solutions

#1


The script must be written in the same encoding cmd.exe uses.

Type chcp at the prompt and see what you get. Then open the file with an editor that supports that encoding. For me chcp outputs codepage 850, so I edit my script in JEdit selecting IBM850 as the file encoding. I get the same result editing the file in PSPad with Format set to OEM.

P.S.: I tested your steps on my machine, and the ñ character that I write in notepad.exe (using the default ANSI encoding) is also shown as a ± when read from the command prompt, so it looks like your machine uses similar ANSI and OEM encodings. To be sure, try replacing the ñ with a ¤ (in notepad.exe). That makes the script work correctly for me when run from the command prompt, because the byte value of the ANSI ¤ is the same as that of the OEM ñ.
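
To make the byte-level mix-up concrete, here is a minimal Python sketch (not part of the original answer). It assumes the ANSI code page is 1252 and the OEM code page is 850, as reported in this thread; the helper name as_seen_by_console is made up for illustration.

```python
# Assumptions taken from the thread: the editor saves in ANSI (cp1252)
# and cmd.exe reads the script in the OEM code page (cp850).
ANSI = "cp1252"
OEM = "cp850"


def as_seen_by_console(text: str) -> str:
    """Encode text the way an ANSI editor would, then decode it the way an OEM console would."""
    return text.encode(ANSI).decode(OEM)


# The ñ is stored as byte 0xF1, which cp850 displays as ±.
print(as_seen_by_console("Pezuñero"))  # Pezu±ero
# The ¤ is stored as byte 0xA4, which cp850 displays as ñ.
print(as_seen_by_console("Pezu¤ero"))  # Pezuñero
```

If chcp reports a different code page on your machine, the byte values involved will differ, so treat the two constants as placeholders.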

#2


Thanks to McDowell and Romulo for pointing me in the right direction. I realized I needed to change my application (in Delphi) that generates the batch file so that it uses the proper (OEM) code page, which is compatible with the command processor in Windows. I didn't find anything to convert strings between code pages, but I did discover the Windows API functions SetFileApisToOEM and SetFileApisToANSI.

I put these at the beginning and end of my program, like this:

{main procedure}
begin
  SetFileApisToOEM;
  {all the rest of the program}
  SetFileApisToANSI;
end.

Now the batch files are generated with the OEM code page, and they work properly when run from a CMD prompt.
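
For anyone generating batch files from a program in another language, one alternative is to encode the file contents in the OEM code page explicitly. This is a rough Python sketch of that idea, not the Delphi approach above; the code page cp850 and the helper name write_batch_file are assumptions, so check what chcp reports on the target machine first.

```python
import os
import tempfile
from typing import Iterable

# Assumption: the target console uses OEM code page 850 (check with chcp).
OEM_CODE_PAGE = "cp850"


def write_batch_file(path: str, lines: Iterable[str], encoding: str = OEM_CODE_PAGE) -> None:
    """Write the given command lines to path, encoded in the OEM code page with CRLF line endings."""
    with open(path, "w", encoding=encoding, newline="\r\n") as f:
        for line in lines:
            f.write(line + "\n")


# Hypothetical usage: write the xcopy command from the question into a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "another.cmd")
write_batch_file(path, ['xcopy "Pezuñero\\1 - 001.wav" \\temp'])
with open(path, "rb") as f:
    print(f.read())  # the ñ is stored as the single byte 0xA4
```

Encoding raises UnicodeEncodeError if a path contains a character that the chosen code page cannot represent, which is probably better than silently writing the wrong byte.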

#3


I've been looking at character handling in cmd.exe and I think Romulo has hit the nail on the head. By default, the prompt uses old DOS (OEM) code pages (probably for compatibility with DOS programs). You are writing your file using (probably) the default Windows code page (likely 1252), which is different. Use edit.com to edit the batch file.

If I type chcp at the prompt, it reports the code page 850.

So, for example, if I use Notepad to type this:

DIR Pezuñero

...this is encoded as 1252 with the binary values:

                        ñ
44 49 52 20 50 65 7A 75 F1 65 72 6F

If I use edit to write the file, it is encoded as 850 with the binary values:

                        ñ
44 49 52 20 50 65 7A 75 A4 65 72 6F
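
The two byte dumps above can be reproduced, and an existing Notepad-saved script converted, with a few lines of Python. This is only a sketch: it assumes code pages 1252 and 850 as in this answer, and the helper names to_hex and ansi_to_oem are invented for the example.

```python
def to_hex(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


def ansi_to_oem(data: bytes, ansi: str = "cp1252", oem: str = "cp850") -> bytes:
    """Re-encode script bytes from the ANSI code page to the OEM code page."""
    return data.decode(ansi).encode(oem)


# Bytes as Notepad would save them (assuming cp1252).
notepad_bytes = "DIR Pezuñero".encode("cp1252")
print(to_hex(notepad_bytes))               # 44 49 52 20 50 65 7A 75 F1 65 72 6F
print(to_hex(ansi_to_oem(notepad_bytes)))  # 44 49 52 20 50 65 7A 75 A4 65 72 6F
```

The conversion fails with an exception for characters that exist in only one of the two code pages (the euro sign, for instance, is in 1252 but not in 850).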

One thing I haven't looked at is the cmd /U switch, but I'm pretty sure it only affects the output of built-in shell commands and won't help you with XCOPY.

#4


Code pages are a problem in batch files, as batch files are not allowed to contain Unicode. The easiest way to avoid this issue altogether would probably be to use WSH or PowerShell. I haven't found a workaround for batch files so far, which really bothers me, as I consider myself a Unicode zealot :)

#5


You may need to set the code page to one that includes the ñ (n with a tilde on top).
