How can I convert a string that is in UCS-2 (2 bytes per character) into a UTF-8 string in Ruby?
3 Solutions
#1
You should look into iconv, which shipped with the Ruby standard library through 1.9 (it was deprecated in 1.9.3, removed in 2.0, and is now available as a separate gem). It is designed for this task.
Specifically,
Iconv.iconv("utf-8", "utf-16", str).first
should handle the conversion.
#2
Because in most cases a string in UCS-2 encoding is also a valid UTF-16 string (characters with code points of 0x10000 and above, which UTF-16 encodes differently, are rarely used), I think Iconv is the better way to convert strings. Sample code:
require 'iconv'
ic = Iconv.new 'UTF-8', 'UTF-16'
utf8string = ic.iconv ucs2string
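The overlap between UCS-2 and UTF-16 mentioned above can be checked directly. This is a minimal sketch that uses String#encode (Ruby 1.9 or later) rather than Iconv; the helper name utf16be_bytes and the two sample characters are made up for illustration:

```ruby
# Hypothetical helper: encode one character as big-endian UTF-16
# and return its bytes as an array of integers.
def utf16be_bytes(char)
  char.encode("UTF-16BE").bytes.to_a
end

# Below U+10000: a single 2-byte unit, identical to UCS-2.
utf16be_bytes("\u00e9")     # => [0, 233]

# U+10000 and above: a 4-byte surrogate pair, which UCS-2 cannot represent.
utf16be_bytes("\u{1F600}")  # => [216, 61, 222, 0]
```

As long as the input contains only characters of the first kind, treating it as UTF-16 gives the same result as treating it as UCS-2.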
#3
With Ruby 1.9 or later:
string.encode("utf-8")
If the string is not already tagged with its real encoding (for example, it was read as binary data), you may need to set it first:
string.force_encoding("utf-16be").encode("utf-8") # Big-endian
string.force_encoding("utf-16le").encode("utf-8") # Little-endian
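Putting the two variants together, here is a rough sketch of a helper that picks the byte order from a leading byte order mark when there is one. The name ucs2_to_utf8, the constants, and the default of big-endian are assumptions for illustration, not part of any library:

```ruby
# Byte order marks as binary strings (pack returns ASCII-8BIT).
BOM_BE = [0xFE, 0xFF].pack("C*")
BOM_LE = [0xFF, 0xFE].pack("C*")

# Hypothetical helper: convert raw UCS-2 bytes to a UTF-8 string.
# byte_order (:be or :le) is only used when the data has no BOM.
def ucs2_to_utf8(bytes, byte_order = :be)
  # Work on a binary copy so the caller's string is left untouched.
  data = bytes.dup.force_encoding("BINARY")

  if data.start_with?(BOM_BE)
    byte_order = :be
    data = data.byteslice(2..-1)   # drop the BOM
  elsif data.start_with?(BOM_LE)
    byte_order = :le
    data = data.byteslice(2..-1)
  end

  source = byte_order == :le ? "UTF-16LE" : "UTF-16BE"
  data.force_encoding(source).encode("UTF-8")
end

ucs2_to_utf8("\x00H\x00i")        # => "Hi" (no BOM, big-endian assumed)
ucs2_to_utf8("H\x00i\x00", :le)   # => "Hi"
```

Copying the input first matters because force_encoding changes the receiver in place. Input with an odd number of bytes or with unpaired surrogates will make encode raise, which this sketch does not handle.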