I have arrays of integers, each of which is an ASCII code representing a single byte of a string.
I can generate a string from the array like this:
Sub BytesToString()
    Dim myArr(): myArr = Array(84, 104, 105, 115, 32, _
        105, 115, 32, 97, 32, 116, 101, 115, 116, 33)
    Dim c As Variant, myStr As String
    For Each c In myArr
        myStr = myStr & Chr(c)
    Next c
    MsgBox myStr
End Sub
...but I feel like this isn't "the right way" to do this, especially since repeated conversions may be needed. Array length will vary.
Is there a built-in or more efficient method to produce the string with VBA?
3 Answers
#1
3
Turns out, this is one of those rare times where the solution was so simple it was overlooked by several people, including myself.
"Byte Arrays" and Strings are basically interchangeable.
In VBA, Byte Arrays are special because, unlike arrays of other data types, a string can be directly assigned to a byte array.
In VBA, Strings are UNICODE strings, so when one assigns a string to a byte array then it stores two bytes for each character. For an ASCII character, the first byte will be the ASCII value of the character and the next will be 0.
(Source: VBA Trick of the Week: Byte Arrays in VBA - Useful Gyaan)
A couple of code samples will likely demonstrate better than I can explain:
Sub Demo1()
    Dim myArr() As Byte, myStr As String
    myStr = "Hi!"
    myArr() = myStr
    Debug.Print "myStr length: " & Len(myStr) 'returns "3"
    Debug.Print "Arr bounds: " & LBound(myArr) & " to " & UBound(myArr) 'returns "0 to 5"
    myStr = myArr
    Debug.Print myStr 'returns "Hi!"
End Sub
In the above case the string's length is 3 so the array’s size will be 6. Values will be stored in the following way:
myArr(0) = 72 ' ASCII : code for 'H'
myArr(1) = 0 ' ASCII 'null' character
myArr(2) = 105 ' ASCII : code for 'i'
myArr(3) = 0 ' ASCII 'null' character
...etc...
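To see the byte layout for yourself, here is a minimal sketch (the Sub name and the test string are my own, chosen only for illustration). It also includes one non-ASCII character, the euro sign (U+20AC), to show that the second byte of a pair is 0 only for characters with a code below 256:

```vba
Sub ShowStringBytes()
    Dim b() As Byte, i As Long
    'Direct String-to-Byte-array assignment copies the raw 2-byte characters
    b = "A" & ChrW(8364)            'ChrW(8364) is the euro sign, U+20AC
    For i = LBound(b) To UBound(b)
        Debug.Print "b(" & i & ") = " & b(i)
    Next i
    'Prints 65, 0 for "A" and then 172, 32 for the euro sign (low byte first)
End Sub
```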
The StrConv function can be used if one wants to remove these zeros. In this case it will store ASCII values only.
myByteArr() = StrConv("*", vbFromUnicode)
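As a rough illustration of the difference (the Sub and variable names here are hypothetical), the sketch below compares the array bounds from a direct assignment with those from StrConv and vbFromUnicode. Note that StrConv converts through the system's ANSI code page, so this one-byte-per-character result is only dependable for plain ASCII text:

```vba
Sub CompareConversions()
    Dim wide() As Byte, narrow() As Byte
    wide = "Hi!"                              '2 bytes per character
    narrow = StrConv("Hi!", vbFromUnicode)    '1 byte per character
    Debug.Print "wide:   0 to " & UBound(wide)      'prints "wide:   0 to 5"
    Debug.Print "narrow: 0 to " & UBound(narrow)    'prints "narrow: 0 to 2"
    Debug.Print narrow(0), narrow(1), narrow(2)     '72, 105, 33
End Sub
```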
Just like a string can be directly assigned to a byte array, a byte array can also be directly assigned to a string. In the above example, if one assigns myArr to a string then it will store the same value that has been assigned to the array.
When the array is populated element-by-element - or, in my case, from a speedy file operation (see below) - an extra step of conversion with StrConv is required.
Sub Demo2()
    Dim myArr(0 To 5) As Byte, myStr As String
    myArr(0) = 104: myArr(1) = 101: myArr(2) = 108
    myArr(3) = 108: myArr(4) = 111: myArr(5) = 33
    Debug.Print "myArr bounds: " & LBound(myArr) & " to " & UBound(myArr) 'returns "0 to 5"

    'since the array was loaded byte-by-byte, we can't "just put back":
    myStr = myArr()
    Debug.Print myStr 'returns "???" (each pair of bytes is read as one 2-byte character)
    Debug.Print "myStr length: " & Len(myStr) 'returns "3"

    'using `StrConv` to expand each byte into a 2-byte unicode character
    myStr = StrConv(myArr(), vbUnicode)
    Debug.Print myStr 'returns "hello!"
    Debug.Print "myStr length: " & Len(myStr) 'returns "6"
End Sub
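Applied to the array from the question, the same idea could look something like the sketch below. The function name CodesToString is my own invention, and it assumes every element is a whole number from 0 to 255 (anything else raises an overflow or type mismatch error). Values above 127 are interpreted through the system's ANSI code page.

```vba
'Hypothetical helper: copies a Variant array of character codes into a
'Byte array, then converts it to a String in a single StrConv call.
Function CodesToString(codes As Variant) As String
    Dim buf() As Byte, i As Long
    If UBound(codes) < LBound(codes) Then Exit Function   'empty array -> ""
    ReDim buf(0 To UBound(codes) - LBound(codes))
    For i = LBound(codes) To UBound(codes)
        buf(i - LBound(codes)) = codes(i)   'assumes each value is 0-255
    Next i
    CodesToString = StrConv(buf, vbUnicode)
End Function

Sub DemoCodesToString()
    Debug.Print CodesToString(Array(84, 104, 105, 115, 32, _
        105, 115, 32, 97, 32, 116, 101, 115, 116, 33))   'prints "This is a test!"
End Sub
```

The loop still touches every element, but it only writes bytes into a preallocated buffer, so no intermediate strings are built along the way.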
How a Byte Array made my day a little better...
I have large text files that I have been wanting to parse/analyze with VBA, but I couldn't find a method that wasn't painfully slow in either the loading or the character-by-character parsing.
As an example, today I managed to load a quarter-gigabyte file in 1/10th of a second, and parsed it into a second Byte Array:
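Dim bytes() As Byte, bytes2() As Byte
Dim x As Long, y As Long, txtIn As String

'read the whole file into a Byte array in one operation
Open myFileName For Binary Access Read As #1
ReDim bytes(LOF(1) - 1&)
Get #1, , bytes
Close #1

'output buffer sized for the worst case (every byte kept)
ReDim bytes2(LBound(bytes) To UBound(bytes))
For x = LBound(bytes) To UBound(bytes)
    Select Case bytes(x)
        '(..and if I want the character)
            bytes2(y) = bytes(x)
            y = y + 1
    End Select
Next x

'trim the unused tail, then convert to a string in one step
ReDim Preserve bytes2(LBound(bytes2) To y - 1)
txtIn = StrConv(bytes2, vbUnicode)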
...and I had my completed string in under 5 seconds total. (Hooray!)
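The snippet above leaves out the actual Case tests, so here is a self-contained sketch of the same filter-and-convert pattern with a stand-in rule. The function name KeepPrintable and the "printable ASCII only" filter (codes 32 to 126) are assumptions made for illustration; they are not the conditions I used on my files. It also expects an array that has already been dimensioned.

```vba
'Hypothetical filter: keeps only printable ASCII bytes (32 to 126)
'and returns them as a String.
Function KeepPrintable(src() As Byte) As String
    Dim dst() As Byte, x As Long, y As Long
    ReDim dst(0 To UBound(src) - LBound(src))   'worst case: keep everything
    For x = LBound(src) To UBound(src)
        Select Case src(x)
            Case 32 To 126                      'assumed rule, for illustration only
                dst(y) = src(x)
                y = y + 1
        End Select
    Next x
    If y = 0 Then Exit Function                 'nothing kept -> ""
    ReDim Preserve dst(0 To y - 1)              'trim the unused tail
    KeepPrintable = StrConv(dst, vbUnicode)
End Function

Sub DemoKeepPrintable()
    Dim raw() As Byte
    raw = StrConv("a" & vbTab & "b" & vbCrLf & "c", vbFromUnicode)
    Debug.Print KeepPrintable(raw)              'prints "abc"
End Sub
```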
More Information:
- Useful Gyaan : VBA Trick of the Week: Byte Array in VBA
- VS Magazine : VB Corner: Searching Within Byte Arrays
- IBM : Convert strings and arrays of byte
- FastExcel : Writing efficient VBA UDFs – Faster string handling and Byte arrays
- Stack Overflow : How many bytes does one Unicode character take?
- MSDN : Data Type Summary (VBA)
- MSDN : Get Statement (VBA)
#2
1
The concatenation is the expensive part of this code. This is something you can handle with Join. I'm not sure this is the proper way of doing it, but it is faster at least:
For i = LBound(myArr) To UBound(myArr)
    myArr(i) = Chr(myArr(i))
Next
MsgBox Join(myArr, "")
#3
1
If you are curious about different ways, you can always count on .NET libraries! In this case, you have to add a reference to mscorlib.dll in your VBA editor and then use this code:
Option Explicit

Sub BytesToString()
    Dim en As ASCIIEncoding
    Set en = New ASCIIEncoding
    Dim myArr(0 To 2) As Byte
    myArr(0) = 72
    myArr(1) = 105
    myArr(2) = 33
    MsgBox en.GetString(myArr)
End Sub
Since you are looking for built-in functions, that is one. But it's inefficient: when I checked, it took approximately 10 times longer than your custom decoder.
UPDATE
However, when I checked this in .NET (C#), it was approximately 20 times faster than the custom approach presented by the OP.