Pandas: adding columns to a DataFrame [duplicate]

Date: 2021-04-02 22:58:19

This question already has an answer here:

So, I've seen this answer here, which is sensible for functions which return one output. What if my function has multiple outputs?

More concretely, let's say I am cross-referencing some data on some ID. But when I call certain IDs, it returns multiple matches, which I want to put into different columns.

An example of this would be something like the below, where worker 3 has two bosses, 0 and 2, while worker 1 has one boss, 2.

Worker_ID Boss_ID
        3       0
        3       2
        1       2

Is it possible to create and populate the extra columns without first going through the data, counting the number of matches, and creating the relevant number of columns?

EDIT:

I'd like something like this in short-form:

Worker_ID  Boss_ID_1 Boss_ID_2   ...as necessary
        3          0         2
        1          2       nan

1 Answer

#1

Create a key using cumcount (a running count of rows within each Worker_ID group), then use pivot to spread that key across columns:

(df.assign(key=df.groupby('Worker_ID').cumcount() + 1)
   .pivot(index='Worker_ID', columns='key', values='Boss_ID')
   .add_prefix('Boss_ID_'))
Out[242]: 
key        Boss_ID_1  Boss_ID_2
Worker_ID                      
1                2.0        NaN
3                0.0        2.0
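
For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. It assumes the sample data from the question; the variable names `key` and `wide` and the final two clean-up steps (clearing the column-axis name and calling `reset_index`) are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Sample data from the question: worker 3 has two bosses, worker 1 has one.
df = pd.DataFrame({"Worker_ID": [3, 3, 1], "Boss_ID": [0, 2, 2]})

# Number each row within its Worker_ID group: 1, 2, ...
# This counter becomes the suffix of the new column names.
key = df.groupby("Worker_ID").cumcount() + 1

wide = (
    df.assign(key=key)
      .pivot(index="Worker_ID", columns="key", values="Boss_ID")
      .add_prefix("Boss_ID_")   # column labels 1, 2 -> Boss_ID_1, Boss_ID_2
)

# Optional clean-up (an addition to the original answer):
wide.columns.name = None        # drop the leftover "key" label on the column axis
wide = wide.reset_index()       # turn Worker_ID back into an ordinary column

print(wide)
```

pivot creates as many Boss_ID_n columns as the largest group needs, so there is no need to count matches beforehand. Two things to be aware of: the result is sorted by Worker_ID rather than kept in the original row order, and the values become floats because missing entries are filled with NaN.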
