I want to group the data in a pandas DataFrame by key and compute the histogram of the grouped data. This is my DataFrame:
     indicator
key
14           1
14           2
14           3
15           1
16           2
16           5
16           6
17           1
18           3
I want to get this result using groupby:
     indicator
key
14       1,2,3
15           1
16       2,5,6
17           1
18           3
and then compute the histogram for every key.
1 solution
numpy.histogram cannot deal with an array nested inside an array. You need to keep your data in long form, one row per (key, indicator) pair, like this:
import numpy as np
import pandas as pd
# one row per key for keys 14..24
dataf = pd.DataFrame()
dataf['key'] = range(14,25)
dataf['indicator'] = [1,1,2,1,3,4,7,15,23,43,67]
# extra rows so that keys 14 and 16 carry several indicator values
dataf.loc[11] = [14,2]
dataf.loc[12] = [14,3]
dataf.loc[13] = [16,5]
dataf.loc[14] = [16,6]
Because no raw data was provided, I can only assume that the data can be reformatted like this:
In [30]: dataf
Out[30]:
    key  indicator
0    14          1
1    15          1
2    16          2
3    17          1
4    18          3
5    19          4
6    20          7
7    21         15
8    22         23
9    23         43
10   24         67
11   14          2
12   14          3
13   16          5
14   16          6
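For reference, the grouped view shown in the question can be derived from a long-form frame like this one. The following is only a minimal sketch: it rebuilds the same sample data in a single call, and the names `grouped` and `as_text` are illustrative, not part of the original answer.

```python
import pandas as pd

# Same sample data as above, built in one call (values are the answer's assumption).
dataf = pd.DataFrame({
    "key": list(range(14, 25)) + [14, 14, 16, 16],
    "indicator": [1, 1, 2, 1, 3, 4, 7, 15, 23, 43, 67, 2, 3, 5, 6],
})

# Collect the indicator values of each key into a Python list.
grouped = dataf.groupby("key")["indicator"].apply(list)

# Optional: render each list as a comma-separated string, as in the question.
as_text = grouped.apply(lambda values: ",".join(str(v) for v in values))

print(as_text)
```

Note that this nested form is convenient for display only; for numeric work such as histograms the long form is easier to handle.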
numpy.histogram already does the grouping of values into bins for you, so you don't need the DataFrame groupby function to get a histogram of the column. You just need to call np.histogram(dataf['indicator']).
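As a rough illustration, the sketch below calls np.histogram on the whole indicator column and then, assuming "the histogram of every key" means one histogram of indicator values per key, repeats the call for each group. Reusing the same bin edges for every key is a choice made here so the per-key counts are comparable; `per_key` is an illustrative name, and `bins=10` is simply NumPy's default written out.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer, built in one call.
dataf = pd.DataFrame({
    "key": list(range(14, 25)) + [14, 14, 16, 16],
    "indicator": [1, 1, 2, 1, 3, 4, 7, 15, 23, 43, 67, 2, 3, 5, 6],
})

# Histogram over the whole column: counts per bin and the bin edges.
counts, edges = np.histogram(dataf["indicator"], bins=10)

# One histogram per key, all computed on the shared edges from above.
per_key = {
    key: np.histogram(group["indicator"], bins=edges)[0]
    for key, group in dataf.groupby("key")
}

print(counts)
print(per_key[14])
```

Because every per-key histogram uses the same edges, the per-key counts add up to the counts of the whole column.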
FYI, if you want to plot the histogram, you can also use DataFrame.hist() (or hist() on a single column, as here):
import matplotlib.pyplot as plt

dataf.indicator.hist()   # draw the histogram of the indicator column
plt.savefig('test.png')  # save the current figure to a file