Sort multiple columns based on two columns in a pandas DataFrame

Time: 2021-04-09 22:34:38

I have a DataFrame with several columns of alphabetical values that I want to sort. For instance:

ii     A.1     A.2      B.1     B.2
1      Xy      foo      Ly      bar
2      Ab      bar      Ko      foo

So I'd like to sort each row according to A.1 and B.1, and reorder A.2 and B.2 according to that order. This would become:

ii     A.1     A.2      B.1     B.2
1      Ly      bar      Xy      foo
2      Ab      bar      Ko      foo

I am trying to use df.apply(lambda x: x.sort_values()). However, I am having problems changing the order of the additional columns (A.2 and B.2). How would you do this?

Edit: to clarify, I need to reorder A.2 and B.2 according to the order given by sorting A.1 and B.1. For instance:

ii     A.1     A.2      B.1     B.2
1      Xy      mat      Ly      bar
2      Ab      zul      Ko      foo #shouldn't change

becomes:

ii     A.1     A.2      B.1     B.2
1      Ly      bar      Xy      mat
2      Ab      zul      Ko      foo #notice, this is unchanged because A.1 B.1 are already sorted 

1 solution

#1


I believe you need numpy.argsort to get, for each row, the positions that would sort A.1 and B.1, then use those indices in arr to pick the values and assign them back:

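import numpy as np

# per-row positions that would sort the values of A.1 and B.1
arr = df[['A.1', 'B.1']].values.argsort()
print (arr)
[[1 0]
 [0 1]]

# reorder both column pairs row by row with the same positions
df[['A.1', 'B.1']] = df[['A.1', 'B.1']].values[np.arange(len(arr))[:,None], arr]
df[['A.2', 'B.2']] = df[['A.2', 'B.2']].values[np.arange(len(arr))[:,None], arr]
print (df)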
   ii A.1  A.2 B.1  B.2
0   1  Ly  bar  Xy  foo
1   2  Ab  bar  Ko  foo
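
To unpack the indexing step: np.arange(len(arr))[:,None] is a column of row numbers that broadcasts against arr, so every row gets reordered by its own argsort result. Here is a minimal sketch on plain arrays (the names keys, vals, order and rows are made up for illustration). It also shows np.take_along_axis, which as far as I know performs the same per-row gather in NumPy 1.15 and later:

```python
import numpy as np

# Illustrative data mirroring the question's A.1/B.1 and A.2/B.2 columns
keys = np.array([["Xy", "Ly"], ["Ab", "Ko"]], dtype=object)
vals = np.array([["foo", "bar"], ["bar", "foo"]], dtype=object)

order = keys.argsort(axis=1)            # positions that sort each row of keys
rows = np.arange(len(order))[:, None]   # shape (2, 1), broadcasts against order

sorted_keys = keys[rows, order]         # each row picked in its own order
sorted_vals = vals[rows, order]         # companion values follow the same order

# Alternative spelling of the same per-row gather
same_vals = np.take_along_axis(vals, order, axis=1)

print(sorted_keys)
print(sorted_vals)
```

Either spelling should work here; the explicit rows index just makes the broadcasting visible.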

With new data:

print (df)
   ii A.1  A.2 B.1  B.2
0   1  Ly  bar  Xy  mat
1   2  Ab  zul  Ko  foo
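
If you need this in more than one place, the same steps can be wrapped in a small function. This is only a sketch: sort_pairs_by_key is a hypothetical helper name, not a pandas API, and it assumes key_cols and value_cols are listed in matching order and that the key values in each row are comparable with one another:

```python
import numpy as np
import pandas as pd


def sort_pairs_by_key(df, key_cols, value_cols):
    """Hypothetical helper: sort key_cols within each row and
    reorder value_cols with the same per-row permutation."""
    out = df.copy()  # leave the caller's frame untouched
    order = out[key_cols].to_numpy().argsort(axis=1)
    rows = np.arange(len(out))[:, None]
    out[key_cols] = out[key_cols].to_numpy()[rows, order]
    out[value_cols] = out[value_cols].to_numpy()[rows, order]
    return out


# The "new data" from the question's edit
df = pd.DataFrame({
    "ii": [1, 2],
    "A.1": ["Xy", "Ab"],
    "A.2": ["mat", "zul"],
    "B.1": ["Ly", "Ko"],
    "B.2": ["bar", "foo"],
})

result = sort_pairs_by_key(df, ["A.1", "B.1"], ["A.2", "B.2"])
print(result)
```

Working on a copy is a design choice of this sketch; the original answer assigns back into df in place.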
