Fastest way to create an average array from multiple number arrays

Date: 2022-03-11 12:56:12

I have multiple input arrays of float, all one-dimensional and of equal size. I want to create a single output array that contains the element-wise average of all the input arrays.

For example:

//input arrays
float[] array1 = new float[] { 1, 1, 1, 1 };
float[] array2 = new float[] { 2, 2, 2, 2 };
float[] array3 = new float[] { 3, 3, 3, 3 };
float[] array4 = new float[] { 4, 4, 4, 4 };

// the output should be
float[] output = new float[] { 2.5f, 2.5f, 2.5f, 2.5f };

I would also like to calculate the standard deviation of the input arrays.

What is the fastest approach to do this task?

Thanks in advance. Pete

4 answers

#1


4  

The fastest way in terms of performance, unless you'd like to unroll the for loop, is:

float[] sums = new float[4];

for(int i = 0; i < 4; i++)
{
    sums[i] = (array1[i]+ array2[i] + array3[i] + array4[i])/4;
}

#2


7  

LINQ to the rescue

This answer details how to use LINQ to achieve the goal, with maximum reusability and versatility as the major objective.

Take 1 (package LINQ into a method for convenience)

Take this method:

float[] ArrayAverage(params float[][] arrays)
{
    // If you want to check that all arrays are the same size, something
    // like this is convenient:
    // var arrayLength = arrays.Select(a => a.Length).Distinct().Single();

    return Enumerable.Range(0, arrays[0].Length)
               .Select(i => arrays.Select(a => a.Skip(i).First()).Average())
               .ToArray();
}

It works by taking the range [0..arrays[0].Length-1], and for each number i in the range it calculates the average of the ith element of each array. It can be used very conveniently:

float[] array1 = new float[] { 1, 1, 1, 1 };
float[] array2 = new float[] { 2, 2, 2, 2 };
float[] array3 = new float[] { 3, 3, 3, 3 };
float[] array4 = new float[] { 4, 4, 4, 4 };

var averages = ArrayAverage(array1, array2, array3, array4);

This can already be used on any number of arrays without modification. But you can go one step further and do something more general.

Take 2 (generalizing for any aggregate function)

float[] ArrayAggregate(Func<IEnumerable<float>, float> aggregate, params float[][] arrays)
{
    return Enumerable.Range(0, arrays[0].Length)
               .Select(i => aggregate(arrays.Select(a => a.Skip(i).First())))
               .ToArray();
}

This can be used to calculate any aggregate function:

var output = ArrayAggregate(Enumerable.Average, array1, array2, array3, array4);

Instead of Enumerable.Average you can substitute any method, extension method, or anonymous function. This is useful because there's no built-in standard deviation aggregate function, and it also makes the ArrayAggregate function very versatile. But we can still do better.
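As an illustration of plugging in a custom aggregate, here is a minimal sketch of a standard deviation function. The names AggregateExamples and StdDev are hypothetical (nothing like them ships with LINQ), and the sketch assumes the population form, dividing by N rather than N - 1. ArrayAggregate is repeated from Take 2 so the block compiles on its own; it indexes with a[i] instead of Skip(i).First(), which gives the same result for arrays.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AggregateExamples
{
    // Hypothetical helper, not part of LINQ: population standard deviation
    // (squared deviations are divided by N, not N - 1).
    public static float StdDev(IEnumerable<float> values)
    {
        float[] v = values.ToArray();   // materialize once, two passes are needed
        if (v.Length == 0)
            throw new ArgumentException("Sequence is empty.", nameof(values));

        float mean = v.Average();
        float sumOfSquares = v.Sum(x => (x - mean) * (x - mean));
        return (float)Math.Sqrt(sumOfSquares / v.Length);
    }

    // Same shape as the Take 2 method, made static and public for this sketch.
    public static float[] ArrayAggregate(Func<IEnumerable<float>, float> aggregate, params float[][] arrays)
    {
        return Enumerable.Range(0, arrays[0].Length)
                   .Select(i => aggregate(arrays.Select(a => a[i])))
                   .ToArray();
    }
}
```

With the four arrays from the question, AggregateExamples.ArrayAggregate(AggregateExamples.StdDev, array1, array2, array3, array4) should give roughly 1.118 in every position, since the values 1, 2, 3, 4 have mean 2.5 and variance 1.25.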

Take 3 (generalizing for any aggregate function and any type of array)

We can also make a generic version that works with any built-in type:

T[] ArrayAggregate<T>(Func<IEnumerable<T>, T> aggregate, params T[][] arrays)
{
    return Enumerable.Range(0, arrays[0].Length)
               .Select(i => aggregate(arrays.Select(a => a.Skip(i).First())))
               .ToArray();
}

As you can probably tell, this is not the fastest code to do the job. If your program spends all day calculating averages, use something closer to the metal. However, if you want reusability and versatility I don't think you can do much better than the above.
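To make the generic version a bit more concrete, here is a small usage sketch. The class name GenericAggregateDemo, the Demo method and the sample data are made up for illustration; ArrayAggregate itself is copied from Take 3 and only made static and public.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GenericAggregateDemo
{
    // Copy of the Take 3 method.
    public static T[] ArrayAggregate<T>(Func<IEnumerable<T>, T> aggregate, params T[][] arrays)
    {
        return Enumerable.Range(0, arrays[0].Length)
                   .Select(i => aggregate(arrays.Select(a => a.Skip(i).First())))
                   .ToArray();
    }

    public static void Demo()
    {
        int[] a = { 1, 5, 3 };
        int[] b = { 4, 2, 6 };

        // Element-wise maximum; T is inferred as int from the array arguments.
        int[] maxima = ArrayAggregate(xs => xs.Max(), a, b);
        Console.WriteLine(string.Join(", ", maxima));   // 4, 5, 6

        double[] x = { 1.0, 2.0 };
        double[] y = { 3.0, 6.0 };

        // Element-wise average of double arrays; T is inferred as double.
        double[] means = ArrayAggregate(xs => xs.Average(), x, y);
        Console.WriteLine(string.Join(", ", means));    // 2, 4
    }
}
```

A lambda is used rather than a method group such as Enumerable.Max because the lambda leaves no doubt about which overload is meant once T has been inferred from the arrays.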

#3


0  

  static void Main()
  {
     float[] array1 = new float[] { 1, 1, 1, 1 };
     float[] array2 = new float[] { 2, 2, 2, 2 };
     float[] array3 = new float[] { 3, 3, 3, 3 };
     float[] array4 = new float[] { 4, 4, 4, 4 };  
     float[] avg = CrossAverage (array1, array2, array3, array4);
     Console.WriteLine (string.Join ("|", avg.Select(f => f.ToString ()).ToArray()));
  }

  private static float[] CrossAverage (params float [][] arrays)
  {
     int [] count = new int [arrays[0].Length];
     float [] sum = new float [arrays[0].Length];
     for (int j = 0; j < arrays.Length; j++)
     {
        for (int i = 0; i < count.Length; i++)
        {
           count[i] ++;
           sum[i] += arrays[j][i];
        }
     }
     float [] avg = new float [arrays[0].Length];
     for (int i = 0; i < count.Length; i++)
     {
        avg[i] = sum[i] / count[i];
     }
     return avg;
  }

Don't forget bounds checking and divide-by-zero checking.
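One way those checks might look, as a sketch only: the class name CheckedAverage is hypothetical, and throwing ArgumentException on bad input is my assumption about the desired behavior, not something the answer specifies. Since every position receives exactly one value per array, the per-element count array is replaced by arrays.Length.

```csharp
using System;

static class CheckedAverage
{
    // Variant of CrossAverage with the input validation spelled out.
    public static float[] CrossAverage(params float[][] arrays)
    {
        // Guards against dividing by zero (no arrays) and null input.
        if (arrays == null || arrays.Length == 0 || arrays[0] == null)
            throw new ArgumentException("At least one non-null array is required.", nameof(arrays));

        int length = arrays[0].Length;
        float[] sum = new float[length];

        foreach (float[] array in arrays)
        {
            // Reject ragged input up front so the inner loop cannot index out of range.
            if (array == null || array.Length != length)
                throw new ArgumentException("All arrays must be non-null and the same length.", nameof(arrays));

            for (int i = 0; i < length; i++)
                sum[i] += array[i];
        }

        // arrays.Length is at least 1 here, so the division is safe.
        float[] avg = new float[length];
        for (int i = 0; i < length; i++)
            avg[i] = sum[i] / arrays.Length;

        return avg;
    }
}
```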

#4


0  

And for the standard deviation after calculating the averages (into the sums array):

// std dev
float[] stddevs = new float[4];

for (int i = 0; i < 4; i++)
{
    stddevs[i] += (array1[i] - sums[i]) * (array1[i] - sums[i]);
    stddevs[i] += (array2[i] - sums[i]) * (array2[i] - sums[i]);
    stddevs[i] += (array3[i] - sums[i]) * (array3[i] - sums[i]);
    stddevs[i] += (array4[i] - sums[i]) * (array4[i] - sums[i]);
}

for (int i = 0; i < 4; i++)
    stddevs[i] = (float)Math.Sqrt(stddevs[i]/4);

In general, accessing the array directly rather than using LINQ will be a performance win due to allowing the compiler/JIT to optimize. At the very least, array bounds checks can be eliminated and the overhead of using an enumerator will be avoided.
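The hard-coded four-array snippet above can be generalized to any number of arrays while still using plain indexed loops. The following is a rough sketch under a few assumptions: the names CrossStats and CrossStdDev are hypothetical, the input is a non-empty set of equal-length arrays (no validation is done), and the population form is used, dividing by N as the snippet above does.

```csharp
using System;

static class CrossStats
{
    // Population standard deviation per position (divides by N).
    // Assumes at least one array and equal lengths; no validation here.
    public static float[] CrossStdDev(params float[][] arrays)
    {
        int n = arrays.Length;
        int length = arrays[0].Length;

        // First pass: per-position mean.
        float[] mean = new float[length];
        for (int j = 0; j < n; j++)
            for (int i = 0; i < length; i++)
                mean[i] += arrays[j][i];
        for (int i = 0; i < length; i++)
            mean[i] /= n;

        // Second pass: accumulate squared deviations from the mean.
        float[] stdDev = new float[length];
        for (int j = 0; j < n; j++)
        {
            for (int i = 0; i < length; i++)
            {
                float d = arrays[j][i] - mean[i];
                stdDev[i] += d * d;
            }
        }

        // Turn the accumulated sums into standard deviations.
        for (int i = 0; i < length; i++)
            stdDev[i] = (float)Math.Sqrt(stdDev[i] / n);

        return stdDev;
    }
}
```

For the question's input, CrossStats.CrossStdDev(array1, array2, array3, array4) should return about 1.118 in each position.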
