I have a DataFrame with 3 columns and 631 rows. Below I show only the unique values in each column.
df
Segment Type    Nature of Query     Q1
PRIME           Request             1
BUSINESS        Complaint           2
PRIORITY        Critical Request    3
                                    4
                                    5
Now, say that under 'Segment Type' I want to group 'PRIME' with every value of 'Nature of Query' and find the size, min, max and mean of 'Q1'.
So I tried to use groupby:
import numpy as np  # pd.np is not available in pandas 2.0+, so import NumPy directly
df.groupby(['Segment Type', 'Nature of Query'])['Q1'].agg([np.size, np.min, np.max, np.mean])
And I got this:
Segment Type  Nature of Query   size  amin  amax      mean
BUSINESS      Request              1     4     4  4.000000
PRIME         Complaint            1     5     5  5.000000
              Critical Request     3     1     2  1.666667
              Request             31     1     5  3.387097
PRIORITY      Critical Request     1     4     4  4.000000
              Request              3     3     5  4.000000
What I wanted as output:
Segment Type  Nature of Query   size  amin  amax      mean
BUSINESS      Request              1     4     4  4.000000
              Complaint            1     5     5  5.000000
              Critical Request     3     1     2  1.666667
PRIME         Complaint            1     5     5  5.000000
              Critical Request     3     1     2  1.666667
              Request             31     1     5  3.387097
PRIORITY      Complaint            1     5     5  5.000000
              Critical Request     1     4     4  4.000000
              Request              3     3     5  4.000000
Ignore the size, mean, max, etc.; they are calculated with respect to Q1. My main problem is with the values of 'Segment Type' and 'Nature of Query': I want every 'Segment Type' listed with every 'Nature of Query', not only the pairs that occur in the data.
If there is a solution, please let me know. Thanks!
2 Answers
#1
I believe you need to reindex the result with a MultiIndex created by MultiIndex.from_product:
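# Aggregate Q1 for each (Segment Type, Nature of Query) pair that occurs in the data
df = df.groupby(['Segment Type', 'Nature of Query'])['Q1'].agg(['size', 'min', 'max', 'mean'])
# Build every combination of the two index levels
mux = pd.MultiIndex.from_product(df.index.levels, names=['Segment Type', 'Nature of Query'])
# Add the missing combinations, filling their statistics with 0
df = df.reindex(mux, fill_value=0).reset_index()
print(df)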
  Segment Type   Nature of Query  size  min  max  mean
0     BUSINESS         Complaint     1    2    2     2
1     BUSINESS  Critical Request     0    0    0     0
2     BUSINESS           Request     0    0    0     0
3        PRIME         Complaint     0    0    0     0
4        PRIME  Critical Request     0    0    0     0
5        PRIME           Request     1    1    1     1
6     PRIORITY         Complaint     0    0    0     0
7     PRIORITY  Critical Request     3    3    5     4
8     PRIORITY           Request     0    0    0     0
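One thing to watch with fill_value=0 is that a missing combination ends up with min, max and mean equal to 0, which can be mistaken for a real Q1 value. A variation, sketched here on made-up sample data (the six rows below are hypothetical, not the asker's 631 rows), reindexes without a fill value and sets only size to 0, so the other statistics stay NaN:

```python
import pandas as pd

# Hypothetical sample data standing in for the real 631-row frame.
df = pd.DataFrame({
    'Segment Type': ['PRIME', 'PRIME', 'BUSINESS', 'PRIORITY', 'PRIORITY', 'PRIORITY'],
    'Nature of Query': ['Request', 'Request', 'Complaint',
                        'Critical Request', 'Critical Request', 'Critical Request'],
    'Q1': [1, 3, 2, 3, 4, 5],
})

# Statistics for the pairs that actually occur in the data.
stats = (
    df.groupby(['Segment Type', 'Nature of Query'])['Q1']
      .agg(['size', 'min', 'max', 'mean'])
)

# Every (Segment Type, Nature of Query) pair built from the observed values.
full_index = pd.MultiIndex.from_product(
    [df['Segment Type'].unique(), df['Nature of Query'].unique()],
    names=['Segment Type', 'Nature of Query'],
)

# Reindex without fill_value: missing pairs become rows of NaN.
result = stats.reindex(full_index).sort_index()

# A missing pair has zero rows, so only 'size' gets a 0; min/max/mean stay NaN.
result['size'] = result['size'].fillna(0).astype(int)

result = result.reset_index()
print(result)
```

Whether 0 or NaN is the better placeholder depends on what is done with the table afterwards; NaN is skipped by later aggregations such as mean(), while 0 is not.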
#2
You could use the pivot_table function; see the tutorial here:
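As a rough sketch of the pivot-table route (my own guess at what this answer has in mind, using hypothetical sample data), pd.pivot_table can compute the same statistics. It relies on the assumption that dropna=False keeps index combinations that have no rows, which is how pivot_table behaves in the pandas versions I am aware of; it is worth checking on your own version. Note that 'count' is used in place of 'size' here, and that missing pairs come back as NaN.

```python
import pandas as pd

# Hypothetical sample data standing in for the real 631-row frame.
df = pd.DataFrame({
    'Segment Type': ['PRIME', 'PRIME', 'BUSINESS', 'PRIORITY', 'PRIORITY', 'PRIORITY'],
    'Nature of Query': ['Request', 'Request', 'Complaint',
                        'Critical Request', 'Critical Request', 'Critical Request'],
    'Q1': [1, 3, 2, 3, 4, 5],
})

# One row per (Segment Type, Nature of Query) pair, one column per statistic.
# dropna=False is assumed to keep the pairs that never occur in the data.
table = pd.pivot_table(
    df,
    values='Q1',
    index=['Segment Type', 'Nature of Query'],
    aggfunc=['count', 'min', 'max', 'mean'],
    dropna=False,
)
print(table)
```

The columns of the result form a two-level index such as ('mean', 'Q1'), so a single statistic is selected with table[('mean', 'Q1')].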