Pandas DataFrame: assign a column by mapping dictionary values (or applying a function) to MultiIndex values

Date: 2021-03-10 21:26:38

I want to map (via dictionary) part of a MultiIndex DataFrame to a column. Is there a way to do that in a single step?

For example, with the following sample DataFrame:

import numpy as np
import pandas as pd
idx = pd.MultiIndex.from_product([['A', 'B', 'C'], np.arange(1, 11)], names=['Name', 'Num'])
df = pd.DataFrame(np.random.randn(30), index=idx, columns=['Vals'])

and sample map:

# Map 0..10 to the letters 'a'..'k'
m = dict(enumerate('abcdefghijk'))

I want to create a column X containing the letter associated with the second index level:

df.assign(X=m[df.index.get_level_values('Num').values])

But that doesn't work, and neither does:

df['X'] = df.index.map(lambda x: m[x[1]])
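For the first attempt, the cause is at least easy to reproduce: a plain dict lookup needs a hashable key, and a NumPy array is not hashable. A minimal sketch, where the small `nums` array is made up for illustration and stands in for the level values:

```python
import numpy as np

m = dict(enumerate('abcdefghijk'))   # same mapping as above: 0 -> 'a', ..., 10 -> 'k'
nums = np.array([1, 2, 3])           # illustrative stand-in for get_level_values('Num').values

try:
    m[nums]                          # the dict tries to hash the whole array
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)

# Looking up one element at a time works; this is what a map() call does for you.
letters = [m[n] for n in nums]
print(letters)
```

This only covers the first attempt; it says nothing about why the second one failed.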

3 Answers

#1


Access the second level with get_level_values, convert it to a Series, and call map or replace:

df['X'] = df.index.get_level_values(1).to_series().map(m).values

Or,

df['X'] = df.index.get_level_values(1).to_series().replace(m).values

Alternatively (inspired by the OP), call map directly on the result of df.index.get_level_values(1) and pass a callable (here, m.get):

df['X'] = df.index.get_level_values(1).map(m.get)

df

              Vals  X
Name Num             
A    1    2.731237  b
     2    0.180595  c
     3   -1.428064  d
     4   -0.622806  e
     5    0.948709  f
     6   -1.383310  g
     7    0.177631  h
     8   -1.071445  i
     9   -0.183859  j
     10   1.480641  k
B    1   -1.036380  b
     2    1.031757  c
     3    0.542989  d
     4   -0.933676  e
     5   -0.540661  f
     6   -0.506969  g
     7    0.572705  h
     8   -1.363675  i
     9   -0.588765  j
     10   0.998691  k
C    1   -0.471536  b
     2   -1.361124  c
     3   -0.382200  d
     4    0.694174  e
     5    1.077779  f
     6   -0.501285  g
     7    0.961986  h
     8   -0.285009  i
     9    1.385881  j
     10   1.490152  k

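In the to_series() variants, I have to call .values because to_series() returns a Series indexed by the Num values alone, which does not match the DataFrame's MultiIndex. Passing the bare array makes the assignment positional, so there are no index alignment issues.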
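To make the alignment point concrete, here is a small sketch on a reduced frame (two names, three numbers, fixed values instead of random ones; all of that is made up for illustration). It shows what index the intermediate Series carries and that assigning its `.values` fills the column by position:

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product([['A', 'B'], [1, 2, 3]], names=['Name', 'Num'])
df = pd.DataFrame({'Vals': np.arange(6.0)}, index=idx)
m = {1: 'b', 2: 'c', 3: 'd'}

mapped = df.index.get_level_values('Num').to_series().map(m)

# to_series() uses the level values themselves as the index (1, 2, 3, 1, 2, 3),
# not the (Name, Num) MultiIndex of df.
print(mapped.index.tolist())

# .values drops that index, so the assignment is purely positional.
df['X'] = mapped.values
print(df['X'].tolist())
```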

#2


Here is another shorthand that works:

df['X'] = df.index.map(lambda x: m.get(x[1]))

Using a dictionary inside a lambda like that is valid; it's just that (apparently) looking the value up with subscript notation (e.g., m[x[1]]) doesn't work in this situation.
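One difference between the two spellings is certain: m.get returns None for a missing key, while m[...] raises KeyError. Whether that explains the failure the OP saw is not established here. The sketch below uses a small made-up frame with one key deliberately missing, and plain list comprehensions in place of index.map so the result does not depend on the pandas version:

```python
import pandas as pd

idx = pd.MultiIndex.from_product([['A', 'B'], [1, 2]], names=['Name', 'Num'])
df = pd.DataFrame({'Vals': [0.1, 0.2, 0.3, 0.4]}, index=idx)
m = {1: 'b'}  # 2 is deliberately missing

# Iterating a MultiIndex yields (Name, Num) tuples; key[1] is the Num part.
# m.get returns None for the missing key instead of raising.
via_get = [m.get(key[1]) for key in df.index]
print(via_get)

# m[...] raises KeyError as soon as it meets the missing key.
try:
    [m[key[1]] for key in df.index]
    raised = False
except KeyError:
    raised = True
print(raised)
```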

#3


Rename the second index level using the dictionary, then assign the renamed level's values back as a column:

df['New'] = df.rename(index=m, level=1).index.get_level_values(1)
df
Out[132]: 
              Vals New
Name Num              
A    1   -0.906266   b
     2    0.321047   c
     3    0.227720   d
     4    3.040522   e
     5    0.604392   f
     6    1.394153   g
     7   -0.640342   h
     8   -0.812858   i
     9   -1.142764   j
     10   0.744968   k
B    1    0.956003   b
     2    0.064266   c
     3    0.042286   d
     4   -1.089578   e
     5    0.534922   f
     6   -0.545524   g
     7    0.102778   h
     8   -1.691460   i
     9   -1.980935   j
     10   1.226609   k
C    1    0.871654   b
     2    0.396818   c
     3    0.691537   d
     4    1.923429   e
     5    0.239363   f
     6   -0.669168   g
     7   -0.168082   h
     8    0.209918   i
     9    0.205527   j
     10   0.490754   k
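As a quick cross-check, the sketch below builds the sample frame with fixed values instead of random ones (an assumption made so the run is repeatable) and compares three of the approaches from the answers. It also passes the dict straight to Index.map, which accepts a dict as well as a callable:

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product([['A', 'B', 'C'], np.arange(1, 11)], names=['Name', 'Num'])
df = pd.DataFrame({'Vals': np.arange(30.0)}, index=idx)
m = dict(enumerate('abcdefghijk'))

num = df.index.get_level_values('Num')

via_series = list(num.to_series().map(m).values)   # Series.map, index stripped with .values
via_index = list(num.map(m))                       # Index.map with the dict itself
via_rename = list(df.rename(index=m, level=1).index.get_level_values(1))

same = via_series == via_index == via_rename
print(same)
print(via_series[:3])
```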
