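An 8-element Byte array contains: E9 A2 91 E9 81 93 31 00
The text stored in this array is known to be “频道1”.
Using a method I found online, the converted result displays as “棰戦亾1”.
If I save the characters “棰戦亾1” to an .htm file with Notepad, open it in IE, and choose View > Encoding > Unicode (UTF-8), the page content becomes “频道1”!
But whatever I try in VB, it shows up as garbage. VB uses Unicode internally, yet VB's controls don't support it! The intrinsic TextBox, the Forms 2.0 TextBox, and the RichTextBox all fail to display it correctly.
I'm looking for a VB algorithm that converts the array above into “频道1” and shows it in a TextBox. In other words, convert “频道1” to ANSI format (C6 B5 B5 C0 31); see Test B below. Another 100 points to whoever gets it working!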
hex(asc("频"))=C6B5
hex(ascw("频"))=9891
hex(asc("棰"))=E9A2
hex(ascw("棰"))=68F0
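A likely explanation for the “棰戦亾1” result, sketched under the assumption that the online method decoded the bytes with the system ANSI code page (936 on Simplified Chinese Windows). In that case E9 A2 is read as a single double-byte character, which matches hex(asc("棰"))=E9A2 above. The routine name ShowMisdecoded is made up for illustration.

```vb
' Sketch only: assumes the system ANSI code page is 936 (GBK).
Private Sub ShowMisdecoded()
    Dim b(6) As Byte
    Dim s As String

    ' The UTF-8 bytes from the question, without the trailing 00.
    b(0) = &HE9: b(1) = &HA2: b(2) = &H91
    b(3) = &HE9: b(4) = &H81: b(5) = &H93
    b(6) = &H31

    ' StrConv with vbUnicode treats the bytes as ANSI text, not as UTF-8,
    ' so each pair of bytes is taken as one double-byte character.
    s = StrConv(b, vbUnicode)

    Debug.Print Len(s)                      ' number of characters produced
    Debug.Print Hex$(AscW(Left$(s, 1)))     ' code point of the first character
End Sub
```

If the assumption holds, the Immediate window should show a length of 4 and then 68F0, the AscW value quoted above for 棰.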
********************************
Test A: type “频道1” in Notepad and save as UTF-8. Hex view of the file:
D:\>debug 1.txt
-d 100 la
0B35:0100 EF BB BF E9 A2 91 E9 81-93 31
-q
Test B: type “频道1” in Notepad and save as ANSI. Hex view of the file:
D:\>debug 2.txt
-d 100 l5
0B35:0100 C6 B5 B5 C0 31
-q
Test C: type “频道1” in Notepad and save as Unicode. Hex view of the file:
D:\>debug 3.txt
-d 100 la
0B35:0100 FF FE 91 98 53 90 31 00-93 31
-q
Test D: write the array's 8 bytes straight to a .txt file. Notepad opens it as “频道1”, and saving once stores it as UTF-8 automatically.
*************************************
Some VB transcoding references:
http://topic.csdn.net/t/20030805/11/2110002.html
http://topic.csdn.net/t/20060421/21/4704931.html
http://topic.csdn.net/t/20060920/12/5035084.html
4 solutions
#1
Private Sub Command1_Click()
    Me.Print UTF8EncodeURI("频道1")
End Sub

' Returns the UTF-8 encoding of szInput as text, one "&Hxx" token per byte.
' Handles characters up to U+FFFF only; surrogate pairs are not combined.
Function UTF8EncodeURI(ByVal szInput As String) As String
    Dim wch As String, szRet As String
    Dim x As Long, nAsc As Long

    For x = 1 To Len(szInput)
        wch = Mid$(szInput, x, 1)
        nAsc = AscW(wch)
        ' AscW returns a signed Integer, so code points >= &H8000 come back negative.
        If nAsc < 0 Then nAsc = nAsc + 65536

        If nAsc < &H80 Then
            ' 1 byte: 0xxxxxxx
            szRet = szRet & "&H" & Hex$(nAsc)
        ElseIf nAsc < &H800 Then
            ' 2 bytes: 110xxxxx 10xxxxxx
            szRet = szRet & "&H" & Hex$((nAsc \ 64) Or &HC0) & _
                    "&H" & Hex$((nAsc And &H3F) Or &H80)
        Else
            ' 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            szRet = szRet & "&H" & Hex$((nAsc \ 4096) Or &HE0) & _
                    "&H" & Hex$(((nAsc \ 64) And &H3F) Or &H80) & _
                    "&H" & Hex$((nAsc And &H3F) Or &H80)
        End If
    Next x
    UTF8EncodeURI = szRet
End Function
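For comparison, a minimal sketch of the same encoding direction that returns an actual Byte array instead of hex text. It uses the Win32 WideCharToMultiByte API rather than hand-written bit arithmetic; the function name UTF8_Encode is hypothetical and not from the thread, and the code is assumed to live in its own module so the Private declarations do not clash with others.

```vb
Private Declare Function WideCharToMultiByte Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long

Private Const CP_UTF8 As Long = 65001

' Hypothetical helper: encodes a VB string as UTF-8 bytes.
' Returns an unallocated array for empty input or on failure.
Public Function UTF8_Encode(ByVal sText As String) As Byte()
    Dim lSize As Long
    Dim b() As Byte

    If Len(sText) = 0 Then Exit Function

    ' First call: a zero-sized output buffer asks only for the byte count.
    ' For CP_UTF8 the two default-character arguments must be null (0).
    lSize = WideCharToMultiByte(CP_UTF8, 0, StrPtr(sText), Len(sText), 0, 0, 0, 0)
    If lSize = 0 Then Exit Function

    ' Second call: convert into a buffer of exactly that size.
    ReDim b(0 To lSize - 1)
    lSize = WideCharToMultiByte(CP_UTF8, 0, StrPtr(sText), Len(sText), _
                                VarPtr(b(0)), lSize, 0, 0)
    If lSize = 0 Then Exit Function

    UTF8_Encode = b
End Function
```

Called with “频道1”, this should give back the seven bytes listed in the question (E9 A2 91 E9 81 93 31), without the trailing 00.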
#2
' Annoying: after a whole day stuck on this, I found the answer was right there in the text of my own post.
' Only after the debug dumps did I realize the array is UTF-8 encoded; once the encoding is known it's easy.
' This problem is solved.
' The pointer parameters are declared As Long because VarPtr/StrPtr values are passed.
' Private, so that everything can sit in the form module next to Form_Load.
Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
Private Const CP_UTF8 As Long = 65001

Public Function UTF8_Decode(bUTF8() As Byte) As String
    Dim lRet As Long
    Dim lLen As Long
    Dim lBufferSize As Long
    Dim sBuffer As String

    lLen = UBound(bUTF8) - LBound(bUTF8) + 1
    If lLen = 0 Then Exit Function
    ' More than enough: decoding never yields more wide characters than input bytes.
    lBufferSize = lLen * 2
    sBuffer = String$(lBufferSize, vbNullChar)
    lRet = MultiByteToWideChar(CP_UTF8, 0, VarPtr(bUTF8(LBound(bUTF8))), lLen, StrPtr(sBuffer), lBufferSize)
    ' lRet is the number of wide characters written; 0 means the call failed,
    ' in which case an empty string is returned.
    UTF8_Decode = Left$(sBuffer, lRet)
End Function

Private Sub Form_Load()
    Dim a(6) As Byte
    a(0) = &HE9
    a(1) = &HA2
    a(2) = &H91
    a(3) = &HE9
    a(4) = &H81
    a(5) = &H93
    a(6) = &H31
    Text1.Text = UTF8_Decode(a())
End Sub
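The question also asks for the ANSI bytes (C6 B5 B5 C0 31). A minimal sketch of that last step, assuming the system ANSI code page is 936 as on the original poster's machine: once the text is a normal VB string, StrConv with vbFromUnicode produces the ANSI bytes. The routine name ShowAnsiBytes is made up for illustration, and the string is built with ChrW$ so the sketch does not depend on any other code in the thread.

```vb
' Sketch only: assumes the system ANSI code page is 936 (GBK).
Private Sub ShowAnsiBytes()
    Dim s As String
    Dim b() As Byte
    Dim i As Long
    Dim sHex As String

    ' "频道1" built from its code points (9891 and 9053, then "1").
    s = ChrW$(&H9891&) & ChrW$(&H9053&) & "1"

    ' Convert the Unicode string to bytes in the system ANSI code page.
    b = StrConv(s, vbFromUnicode)

    ' Format each byte as two hex digits.
    For i = LBound(b) To UBound(b)
        sHex = sHex & Right$("0" & Hex$(b(i)), 2) & " "
    Next i
    Debug.Print Trim$(sHex)
End Sub
```

On such a system the printed bytes should match the Test B dump. This is also, as far as I can tell, why assigning the decoded string to Text1.Text works: the intrinsic TextBox is an ANSI control, and “频道1” is representable in code page 936.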
#3
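The simplest typical usage:
MultiByteToWideChar(CP_ACP, 0&, address of the Chinese text, -1&, address of the VB string, length of the VB string)
Strings in VB are all stored as Unicode. The only time you need to convert a string to Unicode is when it comes from a DLL or a similar source (for example, data sent from another device).
Here is a function I wrote myself: pass it a Byte array containing a string, and it returns a VB string:
' Requires the APIs MultiByteToWideChar and lstrlenA
Declare Function MultiByteToWideChar Lib "kernel32" Alias "MultiByteToWideChar" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
Declare Function lstrlenA Lib "kernel32" (ByVal lpString As Long) As Long
' The byte array must end with a zero byte: both lstrlenA and the -1 length rely on it.
Function ConvToUnicode(str() As Byte) As String
    Const CP_ACP = 0&
    Dim lenstr As Long
    lenstr = lstrlenA(VarPtr(str(0))) + 2&   ' length of the source string in bytes, plus spare room
    ConvToUnicode = String$(lenstr, vbNullChar)   ' allocate a large enough buffer
    lenstr = MultiByteToWideChar(CP_ACP, 0&, VarPtr(str(0)), -1&, StrPtr(ConvToUnicode), lenstr)   ' convert
    If lenstr > 0 Then
        ConvToUnicode = Left$(ConvToUnicode, lenstr - 1&)   ' drop the terminating null and the unused padding
    Else
        ConvToUnicode = vbNullString   ' conversion failed
    End If
End Function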
Below is the MSDN description of the MultiByteToWideChar function:
The MultiByteToWideChar function maps a character string to a wide-character (Unicode) string. The character string mapped by this function is not necessarily from a multibyte character set.
int MultiByteToWideChar(
UINT CodePage, // code page
DWORD dwFlags, // character-type options
LPCSTR lpMultiByteStr, // address of string to map
int cchMultiByte, // number of characters in string
LPWSTR lpWideCharStr, // address of wide-character buffer
int cchWideChar // size of buffer
);
Parameters
CodePage
Specifies the code page to be used to perform the conversion. This parameter can be given the value of any codepage that is installed or available in the system. The following values may be used to specify one of the system default code pages:
Value Meaning
CP_ACP ANSI code page
CP_MACCP Macintosh code page
CP_OEMCP OEM code page
dwFlags
A set of bit flags that indicate whether to translate to precomposed or composite wide characters (if a composite form exists), whether to use glyph characters in place of control characters, and how to deal with invalid characters. You can specify a combination of the following flag constants:
Value Meaning
MB_PRECOMPOSED Always use precomposed characters, that is, characters in which a base character and a nonspacing character have a single character value. This is the default translation option. Cannot be used with MB_COMPOSITE.
MB_COMPOSITE Always use composite characters, that is, characters in which a base character and a nonspacing character have different character values. Cannot be used with MB_PRECOMPOSED.
MB_ERR_INVALID_CHARS If the function encounters an invalid input character, it fails and GetLastError returns ERROR_NO_UNICODE_TRANSLATION.
MB_USEGLYPHCHARS Use glyph characters instead of control characters.
A composite character consists of a base character and a nonspacing character, each having different character values. A precomposed character has a single character value for a base/non-spacing character combination. In the character è, the e is the base character and the accent grave mark is the nonspacing character.
The function's default behavior is to translate to the precomposed form. If a precomposed form does not exist, the function attempts to translate to a composite form.
The flags MB_PRECOMPOSED and MB_COMPOSITE are mutually exclusive. The MB_USEGLYPHCHARS flag and the MB_ERR_INVALID_CHARS can be set regardless of the state of the other flags.
lpMultiByteStr
Points to the character string to be converted.
cchMultiByte
Specifies the size in bytes of the string pointed to by the lpMultiByteStr parameter. If this value is -1, the string is assumed to be null terminated and the length is calculated automatically.
lpWideCharStr
Points to a buffer that receives the translated string.
cchWideChar
Specifies the size, in wide characters, of the buffer pointed to by the lpWideCharStr parameter. If this value is zero, the function returns the required buffer size, in wide characters, and makes no use of the lpWideCharStr buffer.
Return Values
If the function succeeds, and cchWideChar is nonzero, the return value is the number of wide characters written to the buffer pointed to by lpWideCharStr.
If the function succeeds, and cchWideChar is zero, the return value is the required size, in wide characters, for a buffer that can receive the translated string.
If the function fails, the return value is zero. To get extended error information, call GetLastError. GetLastError may return one of the following error codes:
ERROR_INSUFFICIENT_BUFFER
ERROR_INVALID_FLAGS
ERROR_INVALID_PARAMETER
ERROR_NO_UNICODE_TRANSLATION
Remarks
The lpMultiByteStr and lpWideCharStr pointers must not be the same. If they are the same, the function fails, and GetLastError returns the value ERROR_INVALID_PARAMETER.
The function fails if MB_ERR_INVALID_CHARS is set and it encounters an invalid character in the source string. An invalid character is one that would translate to the default character if MB_ERR_INVALID_CHARS was not set, but is not the default character in the source string, or when a lead byte is found in a string and there is no valid trail byte for DBCS strings. When an invalid character is found, and MB_ERR_INVALID_CHARS is set, the function returns 0 and sets GetLastError with the error ERROR_NO_UNICODE_TRANSLATION.
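The cchWideChar and Return Values entries above describe a sizing call: pass zero as the buffer size and the function returns the required size in wide characters. A minimal sketch of that two-call pattern follows. The name BytesToString and its lCodePage parameter are hypothetical, and the array is assumed to be allocated and non-empty. For the UTF-8 array in the question the code page would be 65001 rather than CP_ACP.

```vb
Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

Private Const CP_UTF8 As Long = 65001

' Hypothetical helper: decodes all bytes of b using the given code page.
' Returns an empty string if the conversion fails.
Public Function BytesToString(b() As Byte, ByVal lCodePage As Long) As String
    Dim lBytes As Long
    Dim lChars As Long
    Dim sOut As String

    lBytes = UBound(b) - LBound(b) + 1

    ' First call: cchWideChar = 0, so only the required size is returned
    ' and the output buffer is not used.
    lChars = MultiByteToWideChar(lCodePage, 0, VarPtr(b(LBound(b))), lBytes, 0, 0)
    If lChars = 0 Then Exit Function

    ' Second call: convert into a string of exactly the required length.
    sOut = String$(lChars, vbNullChar)
    lChars = MultiByteToWideChar(lCodePage, 0, VarPtr(b(LBound(b))), lBytes, _
                                 StrPtr(sOut), lChars)
    BytesToString = Left$(sOut, lChars)
End Function
```

Because the byte count is passed explicitly instead of -1, the array does not need a terminating zero; if one is included, as in the 8-byte array from the question, it is converted too and ends up as a trailing null character in the result.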
#4
I could tell at a glance that it was UTF-8. I pasted it into my own program and the very first trial conversion worked.