Python: a fast way to compute the average of several (same-length) lists?

Time: 2021-08-11 21:22:15

Is there a simple way to calculate the mean of several (same-length) lists in Python? Say I have [[1, 2, 3], [5, 6, 7]] and want to obtain [3, 4, 5]. This has to be done 100,000 times, so it needs to be fast.

3 answers

#1


16  

In case you're using numpy (which seems to be more appropriate here):

>>> import numpy as np
>>> data = np.array([[1, 2, 3], [5, 6, 7]])
>>> np.average(data, axis=0)
array([ 3.,  4.,  5.])

#2


3  

In [6]: l = [[1, 2, 3], [5, 6, 7]]

In [7]: [(x+y)/2 for x,y in zip(*l)]
Out[7]: [3, 4, 5]

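(You'll need to decide whether you want integer or floating-point maths, and which kind of division to use. The integer output above reflects Python 2, where / on two ints truncates; on Python 3 the same expression returns [3.0, 4.0, 5.0].)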
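
To make the choice of division concrete, here is a small sketch for Python 3. The input `data` is an illustrative value (not from the original post), picked so that one column has an odd sum and the two operators disagree.

```python
data = [[1, 2, 3], [5, 6, 8]]  # illustrative input; last column sums to 11

# True division (`/`) always returns floats on Python 3.
float_means = [(x + y) / 2 for x, y in zip(*data)]

# Floor division (`//`) keeps ints but rounds down.
int_means = [(x + y) // 2 for x, y in zip(*data)]

print(float_means)  # [3.0, 4.0, 5.5]
print(int_means)    # [3, 4, 5]
```

If exact halves matter, true division is probably the safer default.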

On my computer, the above takes 1.24us:

In [11]: %timeit [(x+y)/2 for x,y in zip(*l)]
1000000 loops, best of 3: 1.24 us per loop

Thus processing 100,000 inputs would take about 0.124 s (100,000 × 1.24 µs).

Interestingly, NumPy arrays are slower on such small inputs:

In [27]: a = np.array(l)

In [28]: %timeit (a[0] + a[1]) / 2
100000 loops, best of 3: 5.3 us per loop

In [29]: %timeit np.average(a, axis=0)
100000 loops, best of 3: 12.7 us per loop

If the inputs get bigger, the relative timings will no doubt change.
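
To check how the timings shift with input size outside IPython, something like the following sketch with the standard `timeit` module could be used. The helper names `mean_pure` and `compare`, the chosen lengths, and the repeat count are all assumptions for illustration; NumPy is treated as optional, and no particular result is claimed since timings depend on the machine.

```python
import timeit

try:
    import numpy as np  # optional; the NumPy timing is skipped if it is missing
except ImportError:
    np = None


def mean_pure(rows):
    """Element-wise mean of two equal-length lists using only built-ins."""
    return [(x + y) / 2 for x, y in zip(*rows)]


def compare(length, repeats=100):
    """Return per-call timings in seconds for one input length (hypothetical helper)."""
    rows = [list(range(length)), list(range(length, 2 * length))]
    timings = {"pure": timeit.timeit(lambda: mean_pure(rows), number=repeats) / repeats}
    if np is not None:
        arr = np.array(rows)  # convert once, outside the timed call
        timings["numpy"] = timeit.timeit(lambda: arr.mean(axis=0), number=repeats) / repeats
    return timings


for n in (3, 100, 10000):
    print(n, compare(n))
```

Note that the array conversion is kept out of the timed call; if the data starts out as Python lists, converting it may take a noticeable share of the total time.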

#3


0  

Extending NPE's answer: for a list containing n sublists that you want to average, use this (a NumPy solution might be faster, but mine uses only built-ins):

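def average(l):
    llen = len(l)  # number of sublists, computed once
    def divide(x):
        return x / llen  # true division on Python 3; use float(llen) on Python 2
    # zip(*l) groups the i-th items of all sublists; sum each group, then divide.
    # list() materializes the result, since map is lazy on Python 3.
    return list(map(divide, map(sum, zip(*l))))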

This sums the sublists element-wise and then divides each sum by the number of sublists, producing the average. You could inline the len computation and turn divide into a lambda like lambda x: x / len(l), but using an explicit function and pre-computing the length should be a bit faster.
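
As a usage sketch for Python 3, here are two variants side by side: one with the length pre-computed and one with the inlined lambda described above. The names `average_comprehension`, `average_lambda`, `rows`, and `data` are illustrative and not part of the original answer; the first variant swaps the nested function for a list comprehension.

```python
def average_comprehension(rows):
    """Column-wise mean of any number of equal-length lists (built-ins only)."""
    count = len(rows)  # computed once, outside the loop
    return [total / count for total in map(sum, zip(*rows))]


def average_lambda(rows):
    """Same result, with len(rows) re-evaluated inside the lambda on every call."""
    return list(map(lambda x: x / len(rows), map(sum, zip(*rows))))


data = [[1, 2, 3], [5, 6, 7], [9, 10, 11]]
print(average_comprehension(data))  # [5.0, 6.0, 7.0]
print(average_lambda(data))         # [5.0, 6.0, 7.0]
```

Both return the same values; any speed difference is likely small and would need to be measured.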
