Why are `buffer` and `new Buffer(buffer.toString())` not always byte-for-byte equal?

Date: 2022-02-08 03:59:10

I was expecting that new Buffer(buffer.toString()) would always be byte-for-byte equal to buffer. However, I have run into a case where that is not true.

First, a case where it is true:

var buf1 = new Buffer(32);
for (var i = 0; i < 32; i++) {
  buf1[i] = i;
}

console.log(buf1);
console.log(new Buffer(buf1.toString()));

<Buffer 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f>
<Buffer 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f>

However, here is a case where it is not true:

var crypto = require('crypto');

var buf2 = crypto.createHmac('sha256', 'key')
    .update('string')
    .digest();

console.log(buf2);
console.log(new Buffer(buf2.toString()));

<Buffer 97 d1 5b ea ba 06 0d 07 38 ec 75 9e a3 18 65 17 8a b8 bb 78 1b 2d 21 07 64 4b a8 81 f3 99 d8 d6>
<Buffer ef bf bd ef bf bd 5b ef bf bd ef bf bd 06 0d 07 38 ef bf bd 75 ef bf bd ef bf bd 18 65 17 ef bf bd ef bf bd ef bf bd 78 1b 2d 21 07 64 4b ef bf bd ef ... >

What is different about buf2 that makes new Buffer(buf2.toString()) not byte-equivalent to buf2?

1 Answer

#1


A Buffer is an object as far as JS is concerned, so == and === compare object references. Since the two Buffers are not the same instance, that kind of equality check will never be true.

To compare Buffer contents, use buffer.equals(buffer2) if you have node v0.12 or newer. For older node versions, you will have to loop over both Buffers and check them byte by byte.
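As a rough sketch of the two approaches just described: the helper below, whose name buffersEqual is made up for illustration, uses equals() when it exists and otherwise falls back to a manual loop. The sample Buffers are built with Buffer.from, which is an assumption about running on a modern Node version; on very old versions you would construct them with new Buffer(...) instead.

```javascript
// Hypothetical helper (not part of Node): compares two Buffers by content.
function buffersEqual(a, b) {
  // Use the built-in when it is available (node v0.12 or newer).
  if (typeof a.equals === 'function') {
    return a.equals(b);
  }
  // Fallback for older versions: compare lengths, then each byte.
  if (a.length !== b.length) {
    return false;
  }
  for (var i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) {
      return false;
    }
  }
  return true;
}

var x = Buffer.from([0x01, 0x02, 0x03]);
var y = Buffer.from([0x01, 0x02, 0x03]);

console.log(x === y);            // false: two distinct objects
console.log(buffersEqual(x, y)); // true: same bytes
```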

Additional explanation:

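Calling .toString() with no arguments decodes the binary data as UTF-8. Any byte sequence in that data that is not valid UTF-8 will typically be replaced by the replacement character \uFFFD, which is then encoded back as the three bytes ef bf bd. Once this replacement occurs, the content is different, causing equals() to return false. You can see this in the second Buffer (the instances of ef bf bd). buf1 survives the round trip because all of its bytes (00 through 1f) are in the ASCII range, which is valid single-byte UTF-8; buf2 is an HMAC digest containing bytes such as 97 and d1 that do not form valid UTF-8 sequences.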
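The replacement behaviour can be checked directly. The sketch below is only an illustration, not part of the original answer: it assumes a modern Node version (it uses Buffer.from rather than the deprecated new Buffer), and the hex and base64 round trips are added here as one possible way to carry arbitrary bytes through a string without loss.

```javascript
var crypto = require('crypto');

// Same digest as buf2 in the question.
var digest = crypto.createHmac('sha256', 'key').update('string').digest();

// A lone byte that is not valid UTF-8 decodes to U+FFFD,
// which re-encodes as the three bytes ef bf bd.
var invalid = Buffer.from([0x97]);
var replaced = Buffer.from(invalid.toString());
console.log(replaced); // <Buffer ef bf bd>

// Decoding arbitrary bytes as UTF-8 (the default) is lossy.
var viaUtf8 = Buffer.from(digest.toString('utf8'), 'utf8');
console.log(viaUtf8.equals(digest)); // false

// Hex and base64 represent every byte value, so the round trip is exact.
var viaHex = Buffer.from(digest.toString('hex'), 'hex');
var viaBase64 = Buffer.from(digest.toString('base64'), 'base64');
console.log(viaHex.equals(digest));    // true
console.log(viaBase64.equals(digest)); // true
```

If the string form is only needed for display or transport, passing an explicit encoding such as 'hex' or 'base64' to both toString() and the Buffer constructor keeps the bytes intact.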
