Comparing two byte arrays in .NET

How can I do this fast?

Sure I can do this:

static bool ByteArrayCompare(byte[] a1, byte[] a2)
{
    if (a1.Length != a2.Length)
        return false;

    for (int i=0; i<a1.Length; i++)
        if (a1[i]!=a2[i])
            return false;

    return true;
}

But I'm looking for either a BCL function or some highly optimized proven way to do this.

java.util.Arrays.equals((sbyte[])(Array)a1, (sbyte[])(Array)a2);

works nicely, but it doesn't look like that would work for x64.

Note my super-fast answer here.

27 answers

#1

496

You can use the Enumerable.SequenceEqual method.

using System;
using System.Linq;
...
var a1 = new int[] { 1, 2, 3};
var a2 = new int[] { 1, 2, 3};
var a3 = new int[] { 1, 2, 4};
var x = a1.SequenceEqual(a2); // true
var y = a1.SequenceEqual(a3); // false
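
Applied to the byte[] case from the question, this might look like the sketch below. The class name and the null handling (two nulls count as equal) are assumptions added for illustration, not part of the answer. The length check up front only avoids enumerating arrays that cannot match.

```csharp
using System;
using System.Linq;

static class SequenceEqualDemo
{
    // Hypothetical wrapper: reference/null checks first, then LINQ's SequenceEqual.
    public static bool ByteArrayCompare(byte[] a1, byte[] a2)
    {
        if (ReferenceEquals(a1, a2)) return true;      // same instance, or both null
        if (a1 == null || a2 == null) return false;    // exactly one is null
        return a1.Length == a2.Length && a1.SequenceEqual(a2);
    }
}
```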

If you can't use .NET 3.5 for some reason, your method is OK.
The compiler and run-time environment will optimize your loop, so you don't need to worry about performance.

#2

216

P/Invoke powers activate!

// Requires: using System; using System.Runtime.InteropServices;
// memcmp takes a size_t count, which maps to UIntPtr in both 32-bit and 64-bit processes.
[DllImport("msvcrt.dll", CallingConvention=CallingConvention.Cdecl)]
static extern int memcmp(byte[] b1, byte[] b2, UIntPtr count);

static bool ByteArrayCompare(byte[] b1, byte[] b2)
{
    // Validate buffers are the same length.
    // This also ensures that the count does not exceed the length of either buffer.
    return b1.Length == b2.Length && memcmp(b1, b2, (UIntPtr)(uint)b1.Length) == 0;
}

#3

150

There's a new built-in solution for this in .NET 4 - IStructuralEquatable

static bool ByteArrayCompare(byte[] a1, byte[] a2) 
{
    return StructuralComparisons.StructuralEqualityComparer.Equals(a1, a2);
}
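
To make the behaviour on a few edge cases concrete, here is a small sketch around the same call. The class name StructuralDemo and the ViaInterface helper are made up for illustration; StructuralComparisons and IStructuralEquatable live in System.Collections. As far as I know, elements are compared through the non-generic comparer, so each byte is boxed, which would explain why this approach benchmarks slowly.

```csharp
using System;
using System.Collections;

static class StructuralDemo
{
    // Same call as in the answer, wrapped so it can be exercised on sample inputs.
    public static bool ByteArrayCompare(byte[] a1, byte[] a2)
    {
        return StructuralComparisons.StructuralEqualityComparer.Equals(a1, a2);
    }

    // The same comparison through the interface that arrays implement explicitly.
    // Unlike the call above, this throws NullReferenceException when a1 is null.
    public static bool ViaInterface(byte[] a1, byte[] a2)
    {
        IStructuralEquatable left = a1;
        return left.Equals(a2, StructuralComparisons.StructuralEqualityComparer);
    }
}
```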

#4

User gil suggested unsafe code which spawned this solution:

// Copyright (c) 2008-2013 Hafthor Stefansson
// Distributed under the MIT/X11 software license
// Ref: http://www.opensource.org/licenses/mit-license.php.
static unsafe bool UnsafeCompare(byte[] a1, byte[] a2) {
  if(a1==a2) return true;
  if(a1==null || a2==null || a1.Length!=a2.Length)
    return false;
  fixed (byte* p1=a1, p2=a2) {
    byte* x1=p1, x2=p2;
    int l = a1.Length;
    for (int i=0; i < l/8; i++, x1+=8, x2+=8)
      if (*((long*)x1) != *((long*)x2)) return false;
    if ((l & 4)!=0) { if (*((int*)x1)!=*((int*)x2)) return false; x1+=4; x2+=4; }
    if ((l & 2)!=0) { if (*((short*)x1)!=*((short*)x2)) return false; x1+=2; x2+=2; }
    if ((l & 1)!=0) if (*((byte*)x1) != *((byte*)x2)) return false;
    return true;
  }
}

which does 64-bit based comparison for as much of the array as possible. This kind of counts on the fact that the arrays start qword aligned. It'll work if not qword aligned, just not as fast as if it were.
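
For readers who cannot enable unsafe code, the same idea (compare 8 bytes at a time, then handle the tail) can be sketched with MemoryMarshal.Cast, assuming .NET Core 2.1 or later, or the System.Memory package. The class and method names are hypothetical, and this is not the code that was benchmarked here.

```csharp
using System;
using System.Runtime.InteropServices;

static class WordCompare
{
    // Compares 8 bytes at a time through a ulong view, then the 0-7 byte tail.
    public static bool Equal(byte[] a1, byte[] a2)
    {
        if (ReferenceEquals(a1, a2)) return true;
        if (a1 == null || a2 == null || a1.Length != a2.Length) return false;

        // Cast drops any trailing bytes that do not fill a whole ulong.
        ReadOnlySpan<ulong> w1 = MemoryMarshal.Cast<byte, ulong>(new ReadOnlySpan<byte>(a1));
        ReadOnlySpan<ulong> w2 = MemoryMarshal.Cast<byte, ulong>(new ReadOnlySpan<byte>(a2));

        for (int i = 0; i < w1.Length; i++)
            if (w1[i] != w2[i]) return false;

        // Remaining bytes that were not covered by the ulong view.
        for (int i = w1.Length * 8; i < a1.Length; i++)
            if (a1[i] != a2[i]) return false;

        return true;
    }
}
```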

It performs about seven times faster than the simple for loop. Using the J# library performed equivalently to the original for loop. Using .SequenceEqual runs around seven times slower, I think just because it is using IEnumerator.MoveNext. I imagine LINQ-based solutions being at least that slow or worse.

#5

If you are not opposed to doing it, you can import the J# assembly "vjslib.dll" and use its Arrays.equals(byte[], byte[]) method...

Don't blame me if someone laughs at you though...

EDIT: For what little it is worth, I used Reflector to disassemble the code for that, and here is what it looks like:

public static bool equals(sbyte[] a1, sbyte[] a2)
{
  if (a1 == a2)
  {
    return true;
  }
  if ((a1 != null) && (a2 != null))
  {
    if (a1.Length != a2.Length)
    {
      return false;
    }
    for (int i = 0; i < a1.Length; i++)
    {
      if (a1[i] != a2[i])
      {
        return false;
      }
    }
    return true;
  }
  return false;
}

#6

.NET 3.5 and newer have a new public type, System.Data.Linq.Binary, that encapsulates byte[]. It implements IEquatable<Binary>, which (in effect) compares two byte arrays. Note that System.Data.Linq.Binary also has an implicit conversion operator from byte[].

MSDN documentation: System.Data.Linq.Binary

Reflector decompile of the Equals method:

private bool EqualsTo(Binary binary)
{
    if (this != binary)
    {
        if (binary == null)
        {
            return false;
        }
        if (this.bytes.Length != binary.bytes.Length)
        {
            return false;
        }
        if (this.hashCode != binary.hashCode)
        {
            return false;
        }
        int index = 0;
        int length = this.bytes.Length;
        while (index < length)
        {
            if (this.bytes[index] != binary.bytes[index])
            {
                return false;
            }
            index++;
        }
    }
    return true;
}

An interesting twist is that they proceed to the byte-by-byte comparison loop only if the hashes of the two Binary objects are the same. This, however, comes at the cost of computing the hash in the constructor of Binary objects (by traversing the array with a for loop :-) ).

The above implementation means that in the worst case you may have to traverse the arrays three times: first to compute the hash of array1, then to compute the hash of array2, and finally (because this is the worst-case scenario, with lengths and hashes equal) to compare the bytes in array1 with the bytes in array2.

Overall, even though System.Data.Linq.Binary is built into BCL, I don't think it is the fastest way to compare two byte arrays :-|.
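
The hash-then-compare pattern described above can be sketched as follows. HashedBytes is a hypothetical stand-in, not the real Binary type, and its hash function is an arbitrary choice made for illustration. The point is only where the three passes happen: one in each constructor, and one in Equals when lengths and hashes match.

```csharp
using System;

sealed class HashedBytes : IEquatable<HashedBytes>
{
    private readonly byte[] bytes;
    private readonly int hashCode;

    public HashedBytes(byte[] value)
    {
        bytes = (byte[])value.Clone();
        // One full pass per object: the hash is computed once, up front.
        int h = 17;
        foreach (byte b in bytes) h = unchecked(h * 31 + b);
        hashCode = h;
    }

    public bool Equals(HashedBytes other)
    {
        if (ReferenceEquals(this, other)) return true;
        if (ReferenceEquals(other, null)) return false;
        if (bytes.Length != other.bytes.Length) return false;
        if (hashCode != other.hashCode) return false;   // cheap rejection, no byte pass

        // Final pass: only reached when lengths and hashes are both equal.
        for (int i = 0; i < bytes.Length; i++)
            if (bytes[i] != other.bytes[i]) return false;
        return true;
    }

    public override bool Equals(object obj) { return Equals(obj as HashedBytes); }
    public override int GetHashCode() { return hashCode; }
}
```

This trade only pays off when the same wrapped array is compared many times, since the hashing cost is paid once per object rather than once per comparison.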

#7

I posted a similar question about checking if byte[] is full of zeroes. (SIMD code was beaten so I removed it from this answer.) Here is the fastest code from my comparisons:

static unsafe bool EqualBytesLongUnrolled (byte[] data1, byte[] data2)
{
    if (data1 == data2)
        return true;
    if (data1.Length != data2.Length)
        return false;

    fixed (byte* bytes1 = data1, bytes2 = data2) {
        int len = data1.Length;
        int rem = len % (sizeof(long) * 16);
        long* b1 = (long*)bytes1;
        long* b2 = (long*)bytes2;
        long* e1 = (long*)(bytes1 + len - rem);

        while (b1 < e1) {
            if (*(b1) != *(b2) || *(b1 + 1) != *(b2 + 1) || 
                *(b1 + 2) != *(b2 + 2) || *(b1 + 3) != *(b2 + 3) ||
                *(b1 + 4) != *(b2 + 4) || *(b1 + 5) != *(b2 + 5) || 
                *(b1 + 6) != *(b2 + 6) || *(b1 + 7) != *(b2 + 7) ||
                *(b1 + 8) != *(b2 + 8) || *(b1 + 9) != *(b2 + 9) || 
                *(b1 + 10) != *(b2 + 10) || *(b1 + 11) != *(b2 + 11) ||
                *(b1 + 12) != *(b2 + 12) || *(b1 + 13) != *(b2 + 13) || 
                *(b1 + 14) != *(b2 + 14) || *(b1 + 15) != *(b2 + 15))
                return false;
            b1 += 16;
            b2 += 16;
        }

        for (int i = 0; i < rem; i++)
            if (data1 [len - 1 - i] != data2 [len - 1 - i])
                return false;

        return true;
    }
}

Measured on two 256MB byte arrays:

UnsafeCompare                           : 86.8784 ms
EqualBytesSimd                          : 71.5125 ms
EqualBytesSimdUnrolled                  : 73.1917 ms
EqualBytesLongUnrolled                  : 39.8623 ms

#8

 using System.Linq; //SequenceEqual

 byte[] ByteArray1 = null;
 byte[] ByteArray2 = null;

 ByteArray1 = MyFunct1();
 ByteArray2 = MyFunct2();

 if (ByteArray1.SequenceEqual<byte>(ByteArray2) == true)
 {
    MessageBox.Show("Match");
 }
 else
 {
   MessageBox.Show("Don't match");
 }

#9

Span<T> offers an extremely competitive alternative without having to throw confusing and/or non-portable crap into your own application's code base:

// byte[] is implicitly convertible to ReadOnlySpan<byte>
static bool ByteArrayCompare(ReadOnlySpan<byte> a1, ReadOnlySpan<byte> a2)
{
    return a1.SequenceEqual(a2);
}

The (guts of the) implementation can be found here.
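
As a usage sketch (the class name and the PrefixEquals helper are assumptions for illustration), the span-based method can be called with plain arrays, and slices can be compared without copying. This assumes a runtime or package that provides System.Memory.

```csharp
using System;

static class SpanCompareDemo
{
    // Same method as above; byte[] arguments convert implicitly at the call site.
    public static bool ByteArrayCompare(ReadOnlySpan<byte> a1, ReadOnlySpan<byte> a2)
    {
        return a1.SequenceEqual(a2);
    }

    // Hypothetical helper: compares only the first `count` bytes of each array.
    public static bool PrefixEquals(byte[] a1, byte[] a2, int count)
    {
        ReadOnlySpan<byte> s1 = a1.AsSpan(0, count);
        ReadOnlySpan<byte> s2 = a2.AsSpan(0, count);
        return s1.SequenceEqual(s2);
    }
}
```

One thing to keep in mind: a null byte[] converts to an empty span, so with this signature a null array and an empty array compare as equal.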

I've revised @EliArbel's gist to add this method as SpansEqual, drop most of the less interesting performers in others' benchmarks, run it with different array sizes, output graphs, and mark SpansEqual as the baseline so that it reports how the different methods compare to SpansEqual.

The numbers below are from the results, lightly edited to reorder (ByteCount=1026 was reported above ByteCount=15 for some reason) and tweaked to fit without a horizontal scroll (removed the "Error" column and shortened the names).

|        Method |  ByteCount |               Mean |          StdDev | Scaled | ScaledSD |
|-------------- |----------- |-------------------:|----------------:|-------:|---------:|
|    SpansEqual |         15 |           3.127 ns |       0.0253 ns |   1.00 |     0.00 |
|  LongPointers |         15 |           4.152 ns |       0.0111 ns |   1.33 |     0.01 |
|    LPUnrolled |         15 |          17.713 ns |       0.0253 ns |   5.67 |     0.04 |
| PInvokeMemcmp |         15 |          12.136 ns |       0.0401 ns |   3.88 |     0.03 |
|               |            |                    |                 |        |          |
|    SpansEqual |       1026 |          31.153 ns |       0.0347 ns |   1.00 |     0.00 |
|  LongPointers |       1026 |          72.723 ns |       0.0451 ns |   2.33 |     0.00 |
|    LPUnrolled |       1026 |          37.151 ns |       0.0571 ns |   1.19 |     0.00 |
| PInvokeMemcmp |       1026 |          39.117 ns |       0.0565 ns |   1.26 |     0.00 |
|               |            |                    |                 |        |          |
|    SpansEqual | 2147483591 | 254,159,485.524 ns | 204,770.1312 ns |   1.00 |     0.00 |
|  LongPointers | 2147483591 | 253,391,428.975 ns | 511,971.4753 ns |   1.00 |     0.00 |
|    LPUnrolled | 2147483591 | 245,669,935.971 ns | 899,789.1015 ns |   0.97 |     0.00 |
| PInvokeMemcmp | 2147483591 | 246,517,934.795 ns | 306,782.8512 ns |   0.97 |     0.00 |

I was surprised to see SpansEqual not come out on top for the max-array-size methods, but the difference is so minor that I don't think it'll ever matter.

My system info:

BenchmarkDotNet=v0.10.12, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.192)
Intel Core i7-6850K CPU 3.60GHz (Skylake), 1 CPU, 12 logical cores and 6 physical cores
Frequency=3515618 Hz, Resolution=284.4450 ns, Timer=TSC
.NET Core SDK=2.1.4
  [Host]     : .NET Core 2.0.5 (Framework 4.6.26020.03), 64bit RyuJIT
  DefaultJob : .NET Core 2.0.5 (Framework 4.6.26020.03), 64bit RyuJIT

I did generate some graphs, but they look pretty useless because they don't compare across methods within a given array size, so I don't have any to post here...

#10

I would use unsafe code and run the for loop comparing Int32 pointers.

Maybe you should also consider checking the arrays to be non-null.
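
A minimal sketch of that suggestion, assuming the project is compiled with unsafe code enabled (AllowUnsafeBlocks). The class and method names are made up, and treating two nulls as equal is an assumption.

```csharp
static class Int32PointerCompare
{
    // Compares 4 bytes at a time through int pointers, then the 0-3 byte tail.
    public static unsafe bool Equal(byte[] a1, byte[] a2)
    {
        if (ReferenceEquals(a1, a2)) return true;   // same instance, or both null
        if (a1 == null || a2 == null || a1.Length != a2.Length) return false;

        int length = a1.Length;
        fixed (byte* p1 = a1, p2 = a2)
        {
            int* i1 = (int*)p1;
            int* i2 = (int*)p2;
            int words = length / 4;

            for (int i = 0; i < words; i++)
                if (i1[i] != i2[i]) return false;

            // Bytes left over after the last whole Int32.
            for (int i = words * 4; i < length; i++)
                if (p1[i] != p2[i]) return false;
        }
        return true;
    }
}
```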

#11

If you look at how .NET does string.Equals, you see that it uses a private method called EqualsHelper which has an "unsafe" pointer implementation. .NET Reflector is your friend to see how things are done internally.

This can be used as a template for byte array comparison; I wrote an implementation of it in the blog post Fast byte array comparison in C#. I also ran some rudimentary benchmarks to see when a safe implementation is faster than the unsafe one.

That said, unless you really need killer performance, I'd go for a simple for loop comparison.

#12

I developed a method that slightly beats memcmp() (plinth's answer) and very slightly beats EqualBytesLongUnrolled() (Arek Bulski's answer). Basically, it unrolls the loop by 4 instead of 8.

public static unsafe bool NewMemCmp(byte* b0, byte* b1, int length)
{
    byte* lastAddr = b0 + length;
    byte* lastAddrMinus32 = lastAddr - 32;
    while (b0 < lastAddrMinus32) // unroll the loop so that we are comparing 32 bytes at a time.
    {
        if (*(ulong*)b0 != *(ulong*)b1) return false;
        if (*(ulong*)(b0 + 8) != *(ulong*)(b1 + 8)) return false;
        if (*(ulong*)(b0 + 16) != *(ulong*)(b1 + 16)) return false;
        if (*(ulong*)(b0 + 24) != *(ulong*)(b1 + 24)) return false;
        b0 += 32;
        b1 += 32;
    }
    while (b0 < lastAddr)
    {
        if (*b0 != *b1) return false;
        b0++;
        b1++;
    }
    return true;
}

public static unsafe bool NewMemCmp(byte[] arr0, byte[] arr1, int length)
{
    fixed (byte* b0 = arr0) fixed (byte* b1 = arr1)
    {
        return NewMemCmp(b0, b1, length);
    }
}

This runs about 25% faster than memcmp() and about 5% faster than EqualBytesLongUnrolled() on my machine.

#13

Let's add one more!

Recently Microsoft released a special NuGet package, System.Runtime.CompilerServices.Unsafe. It's special because it's written in IL, and provides low-level functionality not directly available in C#.

One of its methods, Unsafe.As<T>(object) allows casting any reference type to another reference type, skipping any safety checks. This is usually a very bad idea, but if both types have the same structure, it can work. So we can use this to cast a byte[] to a long[]:

bool CompareWithUnsafeLibrary(byte[] a1, byte[] a2)
{
    if (a1.Length != a2.Length) return false;

    var longSize = (int)Math.Floor(a1.Length / 8.0);
    var long1 = Unsafe.As<long[]>(a1);
    var long2 = Unsafe.As<long[]>(a2);

    for (var i = 0; i < longSize; i++)
    {
        if (long1[i] != long2[i]) return false;
    }

    for (var i = longSize * 8; i < a1.Length; i++)
    {
        if (a1[i] != a2[i]) return false;
    }

    return true;
}

Note that long1.Length would still return the original array's length, since it's stored in a field in the array's memory structure.

This method is not quite as fast as other methods demonstrated here, but it is a lot faster than the naive method, doesn't use unsafe code or P/Invoke or pinning, and the implementation is quite straightforward (IMO). Here are some BenchmarkDotNet results from my machine:

BenchmarkDotNet=v0.10.3.0, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4870HQ CPU 2.50GHz, ProcessorCount=8
Frequency=2435775 Hz, Resolution=410.5470 ns, Timer=TSC
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0
  DefaultJob : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0

                 Method |          Mean |    StdDev |
----------------------- |-------------- |---------- |
          UnsafeLibrary |   125.8229 ns | 0.3588 ns |
          UnsafeCompare |    89.9036 ns | 0.8243 ns |
           JSharpEquals | 1,432.1717 ns | 1.3161 ns |
 EqualBytesLongUnrolled |    43.7863 ns | 0.8923 ns |
              NewMemCmp |    65.4108 ns | 0.2202 ns |
            ArraysEqual |   910.8372 ns | 2.6082 ns |
          PInvokeMemcmp |    52.7201 ns | 0.1105 ns |

I've also created a gist with all the tests.

#14

It seems that EqualBytesLongUnrolled is the best of the methods suggested above.

I skipped the slower methods (Enumerable.SequenceEqual, StructuralComparisons.StructuralEqualityComparer.Equals) because I did not have the patience to wait for them. On 265MB arrays I measured this:

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-3770 CPU 3.40GHz, ProcessorCount=8
Frequency=3323582 ticks, Resolution=300.8802 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=64-bit RELEASE [RyuJIT]
GC=Concurrent Workstation
JitModules=clrjit-v4.6.1590.0

Type=CompareMemoriesBenchmarks  Mode=Throughput  

                 Method |      Median |    StdDev | Scaled | Scaled-SD |
----------------------- |------------ |---------- |------- |---------- |
             NewMemCopy |  30.0443 ms | 1.1880 ms |   1.00 |      0.00 |
 EqualBytesLongUnrolled |  29.9917 ms | 0.7480 ms |   0.99 |      0.04 |
          msvcrt_memcmp |  30.0930 ms | 0.2964 ms |   1.00 |      0.03 |
          UnsafeCompare |  31.0520 ms | 0.7072 ms |   1.03 |      0.04 |
       ByteArrayCompare | 212.9980 ms | 2.0776 ms |   7.06 |      0.25 |

OS=Windows
Processor=?, ProcessorCount=8
Frequency=3323582 ticks, Resolution=300.8802 ns, Timer=TSC
CLR=CORE, Arch=64-bit ? [RyuJIT]
GC=Concurrent Workstation
dotnet cli version: 1.0.0-preview2-003131

Type=CompareMemoriesBenchmarks  Mode=Throughput  

                 Method |      Median |    StdDev | Scaled | Scaled-SD |
----------------------- |------------ |---------- |------- |---------- |
             NewMemCopy |  30.1789 ms | 0.0437 ms |   1.00 |      0.00 |
 EqualBytesLongUnrolled |  30.1985 ms | 0.1782 ms |   1.00 |      0.01 |
          msvcrt_memcmp |  30.1084 ms | 0.0660 ms |   1.00 |      0.00 |
          UnsafeCompare |  31.1845 ms | 0.4051 ms |   1.03 |      0.01 |
       ByteArrayCompare | 212.0213 ms | 0.1694 ms |   7.03 |      0.01 |

#15

For comparing short byte arrays the following is an interesting hack:

if(myByteArray1.Length != myByteArray2.Length) return false;
if(myByteArray1.Length == 8)
   return BitConverter.ToInt64(myByteArray1, 0) == BitConverter.ToInt64(myByteArray2, 0); 
else if(myByteArray1.Length == 4)
   return BitConverter.ToInt32(myByteArray1, 0) == BitConverter.ToInt32(myByteArray2, 0);

Then I would probably fall back to the solution listed in the question.

It'd be interesting to do a performance analysis of this code.
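
Put together as one method, the hack plus the fall-back might look like the sketch below. The class and method names are hypothetical, and no performance claim is implied, since the answer itself leaves that open.

```csharp
using System;

static class ShortArrayCompare
{
    public static bool Equal(byte[] a1, byte[] a2)
    {
        if (a1.Length != a2.Length) return false;

        // Special-case the two short lengths that fit a single integer read.
        if (a1.Length == 8)
            return BitConverter.ToInt64(a1, 0) == BitConverter.ToInt64(a2, 0);
        if (a1.Length == 4)
            return BitConverter.ToInt32(a1, 0) == BitConverter.ToInt32(a2, 0);

        // Fall back to the byte-by-byte loop from the question.
        for (int i = 0; i < a1.Length; i++)
            if (a1[i] != a2[i]) return false;
        return true;
    }
}
```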

#16

I couldn't find a solution I'm completely happy with (reasonable performance, but no unsafe code/pinvoke), so I came up with this. It is nothing really original, but it works:

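    /// <summary>
    /// Compares two byte arrays in 8-byte chunks using BitConverter, then byte by byte for the tail.
    /// </summary>
    /// <param name="array1">First array to compare.</param>
    /// <param name="array2">Second array to compare.</param>
    /// <param name="bytesToCompare"> 0 means compare entire arrays</param>
    /// <returns>true if the compared bytes are equal; otherwise false.</returns>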
    public static bool ArraysEqual(byte[] array1, byte[] array2, int bytesToCompare = 0)
    {
        if (array1.Length != array2.Length) return false;

        var length = (bytesToCompare == 0) ? array1.Length : bytesToCompare;
        var tailIdx = length - length % sizeof(Int64);

        //check in 8 byte chunks
        for (var i = 0; i < tailIdx; i += sizeof(Int64))
        {
            if (BitConverter.ToInt64(array1, i) != BitConverter.ToInt64(array2, i)) return false;
        }

        //check the remainder of the array, always shorter than 8 bytes
        for (var i = tailIdx; i < length; i++)
        {
            if (array1[i] != array2[i]) return false;
        }

        return true;
    }

Performance compared with some of the other solutions on this page:

Simple Loop: 19837 ticks, 1.00
*BitConverter: 4886 ticks, 4.06
UnsafeCompare: 1636 ticks, 12.12
EqualBytesLongUnrolled: 637 ticks, 31.09
P/Invoke memcmp: 369 ticks, 53.67

Tested in LINQPad with identical 1,000,000-byte arrays (worst-case scenario), 500 iterations each.

#17

I thought about block-transfer acceleration methods built into many graphics cards. But then you would have to copy over all the data byte-wise, so this doesn't help you much if you don't want to implement a whole portion of your logic in unmanaged and hardware-dependent code...

Another way of optimization similar to the approach shown above would be to store as much of your data as possible in a long[] rather than a byte[] right from the start, for example if you are reading it sequentially from a binary file, or if you use a memory mapped file, read in data as long[] or single long values. Then, your comparison loop will only need 1/8th of the number of iterations it would have to do for a byte[] containing the same amount of data. It is a matter of when and how often you need to compare vs. when and how often you need to access the data in a byte-by-byte manner, e.g. to use it in an API call as a parameter in a method that expects a byte[]. In the end, you only can tell if you really know the use case...

#18

I did some measurements using the attached program, compiled as a .NET 4.7 release build and run without the debugger attached. I think people have been using the wrong metric: if you care about speed here, what matters is how long it takes to figure out whether two byte arrays are equal, i.e. throughput in bytes.

StructuralComparison :       2838.8 MiB/s
for                  :   30553811.0 MiB/s
ToUInt32             :   23864406.8 MiB/s
ToUInt64             :    5526595.7 MiB/s
memcmp               : 1848977556.1 MiB/s

As you can see, there's no better way than memcmp and it's orders of magnitude faster. A simple for loop is the second best option. And it still boggles my mind why Microsoft cannot simply include a Buffer.Compare method.

[Program.cs]:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading.Tasks;

namespace memcmp
{
    class Program
    {
        static byte[] TestVector(int size)
        {
            var data = new byte[size];
            using (var rng = new System.Security.Cryptography.RNGCryptoServiceProvider())
            {
                rng.GetBytes(data);
            }
            return data;
        }

        static TimeSpan Measure(string testCase, TimeSpan offset, Action action, bool ignore = false)
        {
            var t = Stopwatch.StartNew();
            var n = 0L;
            while (t.Elapsed < TimeSpan.FromSeconds(10))
            {
                action();
                n++;
            }
            var elapsed = t.Elapsed - offset;
            if (!ignore)
            {
                Console.WriteLine($"{testCase,-16} : {n / elapsed.TotalSeconds,16:0.0} MiB/s");
            }
            return elapsed;
        }

        [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
        static extern int memcmp(byte[] b1, byte[] b2, long count);

        static void Main(string[] args)
        {
            // how quickly can we establish if two sequences of bytes are equal?

            // note that we are testing the speed of different comparison methods

            var a = TestVector(1024 * 1024); // 1 MiB
            var b = (byte[])a.Clone();

            var offset = Measure("offset", new TimeSpan(), () => { return; }, ignore: true);

            Measure("StructuralComparison", offset, () =>
            {
                StructuralComparisons.StructuralEqualityComparer.Equals(a, b);
            });

            Measure("for", offset, () =>
            {
                for (int i = 0; i < a.Length; i++)
                {
                    if (a[i] != b[i]) break;
                }
            });

            Measure("ToUInt32", offset, () =>
            {
                for (int i = 0; i < a.Length; i += 4)
                {
                    if (BitConverter.ToUInt32(a, i) != BitConverter.ToUInt32(b, i)) break;
                }
            });

            Measure("ToUInt64", offset, () =>
            {
                for (int i = 0; i < a.Length; i += 8)
                {
                    if (BitConverter.ToUInt64(a, i) != BitConverter.ToUInt64(b, i)) break;
                }
            });

            Measure("memcmp", offset, () =>
            {
                memcmp(a, b, a.Length);
            });
        }
    }
}

#19

Sorry, if you're looking for a managed way you're already doing it correctly and to my knowledge there's no built in method in the BCL for doing this.

You should add some initial null checks and then just reuse it as if it were in the BCL.
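
A sketch of the loop from the question with those initial null checks added. Treating two null references as equal is an assumption made here, not something this answer specifies, and the class name is hypothetical.

```csharp
static class ByteArrayUtil
{
    public static bool ByteArrayCompare(byte[] a1, byte[] a2)
    {
        if (ReferenceEquals(a1, a2)) return true;     // same instance, or both null
        if (a1 == null || a2 == null) return false;   // exactly one is null
        if (a1.Length != a2.Length) return false;

        for (int i = 0; i < a1.Length; i++)
            if (a1[i] != a2[i])
                return false;

        return true;
    }
}
```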

#20

This is almost certainly much slower than any other version given here, but it was fun to write.

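// Requires: using System.Linq;
static bool ByteArrayEquals(byte[] a1, byte[] a2) 
{
    // Zip stops at the shorter sequence, so the lengths must be checked first.
    return a1.Length == a2.Length && a1.Zip(a2, (l, r) => l == r).All(x => x);
}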

#21

Since many of the fancy solutions above don't work with UWP, and because I love LINQ and functional approaches, I present my version for this problem. To stop the comparison when the first difference occurs, I chose .FirstOrDefault().

// n is a 1-based position, so that the default value 0 can mean "no difference found".
public static bool CompareByteArrays(byte[] ba0, byte[] ba1) =>
    !(ba0.Length != ba1.Length || Enumerable.Range(1,ba0.Length)
        .FirstOrDefault(n => ba0[n - 1] != ba1[n - 1]) > 0);

#22

I settled on a solution inspired by the EqualBytesLongUnrolled method posted by ArekBulski, with an additional optimization. In my case, differences between arrays tend to be near the tail of the arrays. In testing, I found that when this is the case for large arrays, being able to compare array elements in reverse order gives this solution a huge performance gain over the memcmp-based solution. Here is that solution:

public enum CompareDirection { Forward, Backward }

private static unsafe bool UnsafeEquals(byte[] a, byte[] b, CompareDirection direction = CompareDirection.Forward)
{
    // returns when a and b are same array or both null
    if (a == b) return true;

    // if either is null or different lengths, can't be equal
    if (a == null || b == null || a.Length != b.Length)
        return false;

    const int UNROLLED = 16;                // count of longs 'unrolled' in optimization
    int size = sizeof(long) * UNROLLED;     // 128 bytes (min size for 'unrolled' optimization)
    int len = a.Length;
    int n = len / size;         // count of full 128 byte segments
    int r = len % size;         // count of remaining 'unoptimized' bytes

    // pin the arrays and access them via pointers
    fixed (byte* pb_a = a, pb_b = b)
    {
        if (r > 0 && direction == CompareDirection.Backward)
        {
            byte* pa = pb_a + len - 1;
            byte* pb = pb_b + len - 1;
            byte* phead = pb_a + len - r;
            while(pa >= phead)
            {
                if (*pa != *pb) return false;
                pa--;
                pb--;
            }
        }

        if (n > 0)
        {
            int nOffset = n * size;
            if (direction == CompareDirection.Forward)
            {
                long* pa = (long*)pb_a;
                long* pb = (long*)pb_b;
                long* ptail = (long*)(pb_a + nOffset);
                while (pa < ptail)
                {
                    if (*(pa + 0) != *(pb + 0) || *(pa + 1) != *(pb + 1) ||
                        *(pa + 2) != *(pb + 2) || *(pa + 3) != *(pb + 3) ||
                        *(pa + 4) != *(pb + 4) || *(pa + 5) != *(pb + 5) ||
                        *(pa + 6) != *(pb + 6) || *(pa + 7) != *(pb + 7) ||
                        *(pa + 8) != *(pb + 8) || *(pa + 9) != *(pb + 9) ||
                        *(pa + 10) != *(pb + 10) || *(pa + 11) != *(pb + 11) ||
                        *(pa + 12) != *(pb + 12) || *(pa + 13) != *(pb + 13) ||
                        *(pa + 14) != *(pb + 14) || *(pa + 15) != *(pb + 15)
                    )
                    {
                        return false;
                    }
                    pa += UNROLLED;
                    pb += UNROLLED;
                }
            }
            else
            {
                long* pa = (long*)(pb_a + nOffset);
                long* pb = (long*)(pb_b + nOffset);
                long* phead = (long*)pb_a;
                while (phead < pa)
                {
                    if (*(pa - 1) != *(pb - 1) || *(pa - 2) != *(pb - 2) ||
                        *(pa - 3) != *(pb - 3) || *(pa - 4) != *(pb - 4) ||
                        *(pa - 5) != *(pb - 5) || *(pa - 6) != *(pb - 6) ||
                        *(pa - 7) != *(pb - 7) || *(pa - 8) != *(pb - 8) ||
                        *(pa - 9) != *(pb - 9) || *(pa - 10) != *(pb - 10) ||
                        *(pa - 11) != *(pb - 11) || *(pa - 12) != *(pb - 12) ||
                        *(pa - 13) != *(pb - 13) || *(pa - 14) != *(pb - 14) ||
                        *(pa - 15) != *(pb - 15) || *(pa - 16) != *(pb - 16)
                    )
                    {
                        return false;
                    }
                    pa -= UNROLLED;
                    pb -= UNROLLED;
                }
            }
        }

        if (r > 0 && direction == CompareDirection.Forward)
        {
            byte* pa = pb_a + len - r;
            byte* pb = pb_b + len - r;
            byte* ptail = pb_a + len;
            while(pa < ptail)
            {
                if (*pa != *pb) return false;
                pa++;
                pb++;
            }
        }
    }

    return true;
}

#23

-1

Use SequenceEqual for this comparison.

#24

-1

The short answer is this:

    public bool Compare(byte[] b1, byte[] b2)
    {
        return Encoding.ASCII.GetString(b1) == Encoding.ASCII.GetString(b2);
    }

In such a way you can use the optimized .NET string compare to make a byte array compare without the need to write unsafe code. This is how it is done in the background:

private unsafe static bool EqualsHelper(String strA, String strB)
{
    Contract.Requires(strA != null);
    Contract.Requires(strB != null);
    Contract.Requires(strA.Length == strB.Length);

    int length = strA.Length;

    fixed (char* ap = &strA.m_firstChar) fixed (char* bp = &strB.m_firstChar)
    {
        char* a = ap;
        char* b = bp;

        // Unroll the loop

        #if AMD64
            // For the AMD64 bit platform we unroll by 12 and
            // check three qwords at a time. This is less code
            // than the 32 bit case and is shorter
            // pathlength.

            while (length >= 12)
            {
                if (*(long*)a     != *(long*)b)     return false;
                if (*(long*)(a+4) != *(long*)(b+4)) return false;
                if (*(long*)(a+8) != *(long*)(b+8)) return false;
                a += 12; b += 12; length -= 12;
            }
       #else
           while (length >= 10)
           {
               if (*(int*)a != *(int*)b) return false;
               if (*(int*)(a+2) != *(int*)(b+2)) return false;
               if (*(int*)(a+4) != *(int*)(b+4)) return false;
               if (*(int*)(a+6) != *(int*)(b+6)) return false;
               if (*(int*)(a+8) != *(int*)(b+8)) return false;
               a += 10; b += 10; length -= 10;
           }
       #endif

        // This depends on the fact that the String objects are
        // always zero terminated and that the terminating zero is not included
        // in the length. For odd string sizes, the last compare will include
        // the zero terminator.
        while (length > 0)
        {
            if (*(int*)a != *(int*)b) break;
            a += 2; b += 2; length -= 2;
        }

        return (length <= 0);
    }
}

#25

-1

I have not seen many linq solutions here.

I am not sure of the performance implications, however I generally stick to linq as rule of thumb and then optimize later if necessary.

public bool CompareTwoArrays(byte[] array1, byte[] array2)
 {
   return !array1.Where((t, i) => t != array2[i]).Any();
 }

Please note that this only works if the arrays are the same size. An extended version could look like this:

public bool CompareTwoArrays(byte[] array1, byte[] array2)
 {
   if (array1.Length != array2.Length) return false;
   return !array1.Where((t, i) => t != array2[i]).Any();
 }

#26

-2

If you are looking for a very fast byte array equality comparer, I suggest you take a look at this STSdb Labs article: Byte array equality comparer. It features some of the fastest implementations for byte[] array equality comparing, which are presented, performance tested and summarized.

You can also focus on these implementations:

BigEndianByteArrayComparer - fast byte[] array comparer from left to right (BigEndian)
BigEndianByteArrayEqualityComparer - fast byte[] equality comparer from left to right (BigEndian)
LittleEndianByteArrayComparer - fast byte[] array comparer from right to left (LittleEndian)
LittleEndianByteArrayEqualityComparer - fast byte[] equality comparer from right to left (LittleEndian)

#27

-5

In case you have a huge byte array, you can compare them by converting them to string.

You can use something like

byte[] b1 = // Your array
byte[] b2 = // Your array
string s1 = Encoding.Default.GetString( b1 );
string s2 = Encoding.Default.GetString( b2 );

I have used this and I have seen a huge performance impact.
