
Time: 2021-02-25 09:11:03

Do you know of any way to convert a set of text files saved in ANSI character encoding to Unicode encoding, either programmatically or via a script?


I would like to get the same result as when I open the file in Notepad and save it as a Unicode file.


6 answers


You can use iconv. On Windows you can use it under Cygwin.


iconv -f from_encoding -t to_encoding file
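Where iconv is not available, the same single-file conversion can be sketched in Python. This is only an illustration, not part of the original answer: the function name transcode and both default encodings (cp1252 as the "ANSI" code page, utf-16 as the target) are assumptions, so substitute the code page your files actually use.

```python
from pathlib import Path


def transcode(src, dst, from_encoding="cp1252", to_encoding="utf-16"):
    """Re-encode the file at src into dst, like `iconv -f ... -t ...`.

    Both default encodings are assumptions chosen for illustration.
    """
    # Decode the raw bytes using the source code page ...
    text = Path(src).read_bytes().decode(from_encoding)
    # ... and encode them again; Python's "utf-16" codec prepends a BOM.
    Path(dst).write_bytes(text.encode(to_encoding))
```

Working on bytes rather than text-mode files keeps the original line endings untouched. A call would look like transcode("in.txt", "out.txt").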


This could work for you, but note that it processes every file in the current folder:


Get-ChildItem | Foreach-Object { $c = (Get-Content $_); `
Set-Content -Encoding UTF8 $c -Path ($_.name + "u") }

Same thing using aliases for brevity:


gci | %{ $c = (gc $_); sc -Encoding UTF8 $c -Path ($_.name + "u") }

Steven Murawski suggests using Out-File instead. The two cmdlets differ as follows:

  • Out-File will attempt to format the input it receives.

  • Out-File's default encoding is Unicode-based, whereas Set-Content uses the system's default.

Here's an example assuming the file test.txt doesn't exist in either case:


PS> [system.string] | Out-File test.txt
PS> Get-Content test.txt

IsPublic IsSerial Name                                     BaseType          
-------- -------- ----                                     --------          
True     True     String                                   System.Object     

# test.txt encoding is Unicode-based with BOM

PS> [system.string] | Set-Content test.txt
PS> Get-Content test.txt
System.String

# test.txt encoding is "ANSI" (Windows character set)

In fact, if you don't need any specific Unicode encoding, you could just as well do the following to convert a text file to Unicode:


PS> Get-Content sourceASCII.txt > targetUnicode.txt

Out-File is a "redirection operator with optional parameters" of sorts.
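To check which of these encodings a cmdlet actually produced, one option is to look at the byte-order mark (BOM) at the start of the file. The following Python sketch does that; the function name sniff_bom and the returned labels are illustrative choices, not anything PowerShell provides.

```python
import codecs

# Longer BOMs first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(path):
    """Return the encoding implied by the file's BOM, or None if it has none."""
    with open(path, "rb") as f:
        head = f.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None
```

A result of None only means that no BOM was found; the file may still be "ANSI" or UTF-8 without a BOM, which cannot be told apart this way.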



The easiest way would be Get-Content 'path/to/text/file' | out-file 'name/of/file'.

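Out-File has an -Encoding parameter; in Windows PowerShell its default is Unicode (UTF-16LE). PowerShell 6 and later default to UTF-8 without a BOM, so pass -Encoding explicitly there.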


If you wanted to script a batch of them, you could do something like


$files = Get-ChildItem 'directory/of/text/files'
foreach ($file in $files) {
  # Parentheses make Get-Content finish reading before Out-File reopens the same file
  (Get-Content $file.FullName) | Out-File $file.FullName
}
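The same batch conversion can be sketched in Python. Treat this as an illustration under stated assumptions: the name convert_folder, the *.txt pattern and the cp1252 source code page are all hypothetical defaults, and the BOM check is an added safeguard that the loop above does not have.

```python
import codecs
from pathlib import Path


def convert_folder(folder, pattern="*.txt", from_encoding="cp1252"):
    """Rewrite every matching file in folder as UTF-16 with a BOM, in place.

    Returns the names of the files that were converted.
    """
    converted = []
    for path in sorted(Path(folder).glob(pattern)):
        if not path.is_file():
            continue
        # Read everything before the file is reopened for writing.
        raw = path.read_bytes()
        if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
            continue  # already UTF-16; decoding it as ANSI would garble it
        path.write_bytes(raw.decode(from_encoding).encode("utf-16"))
        converted.append(path.name)
    return converted
```

Because files are overwritten in place, it is worth running this on a copy of the folder first. Skipping files that already start with a UTF-16 BOM makes a second run harmless.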


Use the System.IO.StreamReader class (to read the file contents) together with the System.Text.Encoding base class (to obtain the encoding object that performs the conversion).
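The reader-plus-encoding pattern is not specific to .NET. As a rough Python analogue (not the .NET API itself), a text reader opened with the source encoding can be copied into a text writer opened with the target encoding, so large files are converted chunk by chunk. The function name transcode_stream and the default encodings are assumptions for illustration.

```python
import shutil


def transcode_stream(src, dst, from_encoding="cp1252", to_encoding="utf-16"):
    """Copy src to dst chunk by chunk, decoding and re-encoding on the fly."""
    # newline="" turns off newline translation, so line endings pass through unchanged.
    with open(src, "r", encoding=from_encoding, newline="") as reader, \
         open(dst, "w", encoding=to_encoding, newline="") as writer:
        shutil.copyfileobj(reader, writer)
```

The source and destination must be different paths here, since the destination is truncated as soon as it is opened.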



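You could create a new text file and write the bytes from the original file into the new one, placing a '\0' before each original byte. This works only if the original file is plain English text (7-bit ASCII), and it yields big-endian UTF-16. The "Unicode" format that Notepad writes is little-endian, so there the '\0' goes after each original byte and the file starts with a byte-order mark.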
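The byte-interleaving idea can be checked against a real UTF-16 encoder. The Python sketch below uses a hypothetical helper, widen_ascii, and compares its output with Python's built-in codecs; it deliberately rejects anything outside 7-bit ASCII, because the trick is not valid for other characters.

```python
def widen_ascii(data, little_endian=True):
    """Pair every ASCII byte with a zero byte, giving BOM-less UTF-16."""
    if any(b > 0x7F for b in data):
        raise ValueError("only 7-bit ASCII input can be widened this way")
    out = bytearray()
    for b in data:
        # Little-endian puts the zero after the byte, big-endian before it.
        out += bytes((b, 0)) if little_endian else bytes((0, b))
    return bytes(out)


# The result matches Python's own UTF-16 encoders.
assert widen_ascii(b"Hi") == "Hi".encode("utf-16-le")
assert widen_ascii(b"Hi", little_endian=False) == "Hi".encode("utf-16-be")
```

For real files a proper decode and encode step is the safer choice, since it also handles accented letters and other non-ASCII characters.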


Pseudocode:

Dim system, file, contents, newFile, oldFile

Const ForReading = 1, ForWriting = 2, ForAppending = 3
Const AnsiFile = -2, UnicodeFile = -1

Set system = CreateObject("Scripting.FileSystemObject...

' Read the whole file as ANSI text, then release it
Set file = system.GetFile("text1.txt")
Set oldFile = file.OpenAsTextStream(ForReading, AnsiFile)
contents = oldFile.ReadAll()
oldFile.Close

' Recreate the file and write the text back as Unicode
system.CreateTextFile "text1.txt"
Set file = system.GetFile("text1.txt")
Set newFile = file.OpenAsTextStream(ForWriting, UnicodeFile)
newFile.Write contents
newFile.Close


Hope this approach works.



