DataFrame:
c_os_family_ss c_os_major_is l_customer_id_i
0 Windows 7 90418
1 Windows 7 90418
2 Windows 7 90418
Code:
print(df)
for name, group in df.groupby('l_customer_id_i').agg(lambda x: ','.join(x)):
    print(name)
    print(group)
I'm trying to just loop over the aggregated data, but I get the error:
ValueError: too many values to unpack
@EdChum, here's the expected output:
c_os_family_ss \
l_customer_id_i
131572 Windows 7,Windows 7,Windows 7,Windows 7,Window...
135467 Windows 7,Windows 7,Windows 7,Windows 7,Window...
c_os_major_is
l_customer_id_i
131572 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...
135467 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...
The output is not the problem; I want to loop over every group.
Solutions
#1
79
df.groupby('l_customer_id_i').agg(lambda x: ','.join(x))
already returns a DataFrame, so you can no longer loop over the groups.
In general:
- df.groupby(...) returns a GroupBy object (a DataFrameGroupBy or SeriesGroupBy), and you can iterate through the groups with it (as explained in the docs). You can do something like:

    grouped = df.groupby('A')
    for name, group in grouped:
        ...
- When you apply a function on the groupby, in your example df.groupby(...).agg(...) (but this can also be transform, apply, mean, ...), you combine the results of applying the function to the different groups into one DataFrame (the apply and combine steps of the 'split-apply-combine' paradigm of groupby). So the result is always a DataFrame again (or a Series, depending on the applied function).
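The two points above can be sketched side by side. This is only an illustration: the sample values are invented, and the frame is just shaped like the one in the question.

```python
import pandas as pd

# Hypothetical data shaped like the question's frame (values invented for illustration).
df = pd.DataFrame({
    "c_os_family_ss": ["Windows", "Windows", "Linux"],
    "c_os_major_is": ["7", "7", "14"],
    "l_customer_id_i": [90418, 90418, 131572],
})

grouped = df.groupby("l_customer_id_i")

# Split step: iterating the GroupBy object yields (key, sub-DataFrame) pairs.
group_sizes = {}
for name, group in grouped:
    group_sizes[int(name)] = len(group)
    print(name, group.shape)

# Apply and combine steps: agg returns one DataFrame indexed by the group key.
aggregated = grouped.agg(lambda x: ",".join(x))
print(type(aggregated).__name__)

# Iterating a DataFrame yields its column labels (plain strings), so
# unpacking each one into (name, group) raises the ValueError from the question.
error_message = None
try:
    for name, group in aggregated:
        pass
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)
```

So the loop has to run over the GroupBy object itself, before agg is called, if you want one sub-DataFrame per customer.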
#2
6
You can iterate over the index values if your dataframe has already been created.
df = df.groupby('l_customer_id_i').agg(lambda x: ','.join(x))
for name in df.index:
    print(name)
    print(df.loc[name])
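A possible variant of the same idea, not part of the original answer, is DataFrame.iterrows(), which hands back the index label and the row together. The sample data below is invented for illustration.

```python
import pandas as pd

# Hypothetical data shaped like the question's frame (values invented for illustration).
df = pd.DataFrame({
    "c_os_family_ss": ["Windows", "Windows", "Linux"],
    "c_os_major_is": ["7", "7", "14"],
    "l_customer_id_i": [90418, 90418, 131572],
})

agg_df = df.groupby("l_customer_id_i").agg(lambda x: ",".join(x))

# iterrows() yields (index label, row as a Series) for each aggregated row.
joined_families = {}
for name, row in agg_df.iterrows():
    joined_families[int(name)] = row["c_os_family_ss"]
    print(name, row.to_dict())
```

Either way, the loop visits the aggregated rows, one per customer id, rather than the original groups.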