NumPy histogram on a multi-dimensional array

Time: 2021-05-16 14:54:42

Given an np.array of shape (n_days, n_lat, n_lon), I'd like to compute a histogram with fixed bins for each lat-lon cell (i.e. the distribution of daily values in that cell).

A simple solution to the problem is to loop over the cells and invoke np.histogram for each cell::


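import numpy as np

n_days, n_lat, n_lon, n_bins = 31, 10, 10, 10  # example sizes
bins = np.linspace(0, 1.0, n_bins + 1)  # n_bins + 1 edges define n_bins bins
B = np.random.rand(n_days, n_lat, n_lon)
H = np.zeros((n_bins, n_lat, n_lon), dtype=np.int32)
for lat in range(n_lat):
    for lon in range(n_lon):
        # counts of the daily values of one lat-lon cell
        H[:, lat, lon] = np.histogram(B[:, lat, lon], bins=bins)[0]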

But this is quite slow. Is there a more efficient solution that does not involve a loop?

I looked into np.searchsorted to get the bin index for each value in B, and then tried fancy indexing to update H::

bin_indices = bins.searchsorted(B)
H[bin_indices.ravel(), idx[0], idx[1]] += 1  # where idx is an index grid given by np.indices
# note: code not tested

But this does not work, because with fancy indexing the in-place add operator (+=) does not accumulate repeated indices: a cell that appears several times in the index is incremented only once.
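
For reference, the repeated-index problem can be sidestepped with np.add.at, which performs the update once per index entry. The following is only a sketch under assumptions: the array sizes, the seed and the name edges are made up for illustration, and values outside [0, 1] would be clamped into the first or last bin rather than dropped as np.histogram does.

```python
import numpy as np

# Small demo of the problem: index 0 appears twice but is incremented once.
a = np.zeros(3, dtype=int)
a[[0, 0, 1]] += 1
print(a)  # [1 1 0]

# Hypothetical sizes for illustration only.
n_days, n_lat, n_lon, n_bins = 31, 4, 5, 10
edges = np.linspace(0.0, 1.0, n_bins + 1)
B = np.random.default_rng(0).random((n_days, n_lat, n_lon))

# Bin index of every value. Searching only the interior edges keeps the
# result in 0..n_bins-1 and puts a value equal to the last edge in the last bin.
bin_indices = np.searchsorted(edges[1:-1], B, side="right")

# Index grids with the same shape as B (the day index is not needed).
_, lat_idx, lon_idx = np.indices(B.shape)

# Unbuffered in-place add: every (bin, lat, lon) triple counts, repeats included.
H = np.zeros((n_bins, n_lat, n_lon), dtype=np.int64)
np.add.at(H, (bin_indices, lat_idx, lon_idx), 1)
```

Each column H[:, lat, lon] should then sum to n_days. np.add.at is known to be slower than ordinary indexing, so whether it beats the double loop depends on the array sizes.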

thx, Peter

2 answers

#1


3  

You can use numpy.apply_along_axis to eliminate the explicit loop. np.histogram returns a (counts, bin_edges) pair, so only the counts are kept for each cell:

hist = np.apply_along_axis(lambda x: np.histogram(x, bins=bins)[0], 0, B)
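
A self-contained sketch of the apply_along_axis approach, to show the shape of the result. The sizes, the seed and the helper name cell_counts are assumptions for illustration. As far as I know, apply_along_axis still calls the function once per cell from Python, so it mainly tidies the code and the speedup may be modest.

```python
import numpy as np

# Hypothetical sizes for illustration only.
n_days, n_lat, n_lon, n_bins = 31, 4, 5, 10
bins = np.linspace(0.0, 1.0, n_bins + 1)  # n_bins + 1 edges
B = np.random.default_rng(0).random((n_days, n_lat, n_lon))


def cell_counts(x):
    """Return only the counts of np.histogram for one cell's daily values."""
    return np.histogram(x, bins=bins)[0]


# Axis 0 (days) is replaced by the length-n_bins output of cell_counts.
H = np.apply_along_axis(cell_counts, 0, B)
print(H.shape)  # (10, 4, 5)
```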

#2


0  

Maybe this works?:


import numpy as np

n_days = 31
n_lat = 10
n_lon = 10
n_bins = 10
# n_bins + 1 edges define n_bins bins
bins = np.linspace(0, 1.0, n_bins + 1)
B = np.random.rand(n_days, n_lat, n_lon)

# flatten to 1D
C = np.reshape(B, n_days * n_lat * n_lon)
# use digitize to get the index of the bin to which each number belongs
D = np.digitize(C, bins) - 1
# reshape the bin indices back to the original shape
result = np.reshape(D, (n_days, n_lat, n_lon))
# note: result holds one bin index per value, not yet the per-cell counts
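
The digitize step gives a bin index per value; to get the counts per lat-lon cell, one possible follow-up (a sketch, not necessarily what the answer intended) is to combine the bin index with a cell id and count the combined index with np.bincount. The names cell_id and flat, and the seed, are assumptions for illustration; values outside [0, 1] are clipped into the edge bins here.

```python
import numpy as np

n_days, n_lat, n_lon, n_bins = 31, 10, 10, 10
bins = np.linspace(0.0, 1.0, n_bins + 1)  # n_bins + 1 edges
B = np.random.default_rng(0).random((n_days, n_lat, n_lon))

# Bin index of every value; clip so that a value equal to the last edge
# falls into the last bin instead of an overflow bin.
D = np.clip(np.digitize(B, bins) - 1, 0, n_bins - 1)

# One integer id per lat-lon cell, broadcast over the day axis.
n_cells = n_lat * n_lon
cell_id = np.arange(n_cells).reshape(n_lat, n_lon)

# Combined index: bin * n_cells + cell, counted in a single pass.
flat = (D * n_cells + cell_id).ravel()
H = np.bincount(flat, minlength=n_bins * n_cells).reshape(n_bins, n_lat, n_lon)
```

H[:, lat, lon] then holds the histogram of the daily values of that cell, without a Python-level loop over cells.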
