Errors on a weighted histogram in Python

Time: 2021-09-02 14:58:48

I want to calculate the error on a bin height by taking the square root of the sum of the squared weights (sumw2) in that bin (the Poisson error). Is there a way to get the sum of weights (sumw) and/or sumw2 when histogramming data with matplotlib or numpy (or any other library, for that matter)?

Let's say I have some data in a numpy array x and some weights in another numpy array w. To get the histogram I would do either

n, bins, patches = pyplot.hist(x,weights=w)

or

n, bins = numpy.histogram(x,weights=w)

In both cases I have no clue which entries of w belong to which bin, right?

Edit: Currently I'm using YODA to do this. The disadvantage from my point of view is that YODA histograms can only be filled with one data point at a time.

1 solution

#1


According to the numpy documentation, weights is:

An array of weights, of the same shape as a. Each value in a only contributes its associated weight towards the bin count (instead of 1). If density is True, the weights are normalized, so that the integral of the density over the range remains 1.

That means each value in w must be associated with a value in x. If you'd like to weight the bins themselves and plot them, you can first compute the bin counts, multiply them by the weights, and then plot the result using bar.

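import numpy as np
import matplotlib.pyplot as plt

val, pos = np.histogram(np.arange(1000))  # val: counts per bin, pos: bin edges
w_val = val * w  # here w needs one weight per bin, i.e. len(w) == len(val)
plt.bar(pos[:-1], w_val, width=np.diff(pos), align="edge")  # bars span each bin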
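
To illustrate what the documentation quote means, here is a minimal sketch with made-up values for x and w (both are assumptions, not data from the question). With weights=w, each bin's height is the sum of the weights of the entries that fall into that bin, which is the sumw asked about.

```python
import numpy as np

# Hypothetical data: four entries and one weight per entry.
x = np.array([0.1, 0.2, 0.7, 0.9])
w = np.array([1.0, 2.0, 0.5, 0.5])

# Two equal-width bins on [0, 1]: [0, 0.5) and [0.5, 1].
sumw, edges = np.histogram(x, bins=2, range=(0.0, 1.0), weights=w)
counts, _ = np.histogram(x, bins=2, range=(0.0, 1.0))

print(sumw)    # per-bin sum of weights
print(counts)  # per-bin number of entries
```

The first bin holds the entries 0.1 and 0.2, so its height is the sum of their weights rather than the entry count.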


Update from the comment:

Ahh, sorry, it seems I didn't understand the initial question. Actually, you can use pos to find the entries that fall into each bin and calculate your weight function from that information.

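sumw2 = []  # one entry per bin
for i, (left, right) in enumerate(zip(pos, pos[1:])):
    # bins are half-open [left, right); only the last bin includes its right edge
    in_bin = (x >= left) & ((x <= right) if i == len(pos) - 2 else (x < right))
    sumw2.append(np.sum(w[in_bin] ** 2))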
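
A vectorized alternative, sketched under the assumption that x and w are one-dimensional arrays of equal length: since weights=w yields the per-bin sum of weights, passing the squared weights with the same bin edges should yield sumw2 per bin, and its square root is the error described in the question. The data, the seed, and the choice of 20 bins below are made up for illustration. The np.digitize part is only a cross-check that also shows which bin each entry belongs to.

```python
import numpy as np

rng = np.random.default_rng(0)        # hypothetical example data
x = rng.normal(size=1000)
w = rng.uniform(0.5, 1.5, size=1000)

# Per-bin sum of weights (the bin heights) and the bin edges.
sumw, edges = np.histogram(x, bins=20, weights=w)
# Reuse the same edges so both histograms share the binning.
sumw2, _ = np.histogram(x, bins=edges, weights=w**2)
err = np.sqrt(sumw2)

# Cross-check: assign each entry to its bin explicitly.
# np.digitize returns 1..nbins for values inside the edges; the maximum of x
# sits on the last edge and gets nbins + 1, so clip it into the last bin.
idx = np.clip(np.digitize(x, edges), 1, len(edges) - 1) - 1
sumw2_check = np.bincount(idx, weights=w**2, minlength=len(edges) - 1)
```

The resulting err can be passed as yerr to pyplot.errorbar, using the bin centers 0.5 * (edges[:-1] + edges[1:]) as the x positions.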
