How do I sort a pandas DataFrame by the values of several columns?

Time: 2022-11-19 01:38:49

I have the following data frame:

df = pandas.DataFrame([{'c1':3,'c2':10},{'c1':2, 'c2':30},{'c1':1,'c2':20},{'c1':2,'c2':15},{'c1':2,'c2':100}])

Or, in human readable form:

   c1   c2
0   3   10
1   2   30
2   1   20
3   2   15
4   2  100

The following sorting-command works as expected:

df.sort(['c1','c2'], ascending=False)

Output:

   c1   c2
0   3   10
4   2  100
1   2   30
3   2   15
2   1   20

But the following command:

df.sort(['c1','c2'], ascending=[False,True])

results in

   c1   c2
2   1   20
3   2   15
1   2   30
4   2  100
0   3   10

and this is not what I expect. I expect to have the values in the first column ordered from largest to smallest, and if there are identical values in the first column, order by the ascending values from the second column.

Does anybody know why it does not work as expected?

ADDED

This is copy-paste:

>>> df.sort(['c1','c2'], ascending=[False,True])
   c1   c2
2   1   20
3   2   15
1   2   30
4   2  100
0   3   10

7 answers

#1


54  

UPDATE: DataFrame.sort is deprecated (since pandas 0.17) and has been removed in later releases; use DataFrame.sort_values.

>>> df.sort_values(['c1','c2'], ascending=[False,True])
   c1   c2
0   3   10
3   2   15
1   2   30
4   2  100
2   1   20
>>> df.sort(['c1','c2'], ascending=[False,True])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ampawake/anaconda/envs/pseudo/lib/python2.7/site-packages/pandas/core/generic.py", line 3614, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'sort'
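
For readers on a current pandas release, here is a minimal self-contained sketch of the same sort as a script rather than a REPL session. It assumes only the frame from the question; the variable name result is illustrative.

```python
import pandas as pd

df = pd.DataFrame([{'c1': 3, 'c2': 10}, {'c1': 2, 'c2': 30}, {'c1': 1, 'c2': 20},
                   {'c1': 2, 'c2': 15}, {'c1': 2, 'c2': 100}])

# c1 descending; rows that tie on c1 are ordered by c2 ascending
result = df.sort_values(by=['c1', 'c2'], ascending=[False, True])

print(result)
print(result.index.tolist())  # original row labels in sorted order: [0, 3, 1, 4, 2]
```

The ascending list is matched to the by list position by position, so the two lists should have the same length.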

ORIGINAL ANSWER

Your code works for me.

>>> import pandas
>>> df = pandas.DataFrame([{'c1':3,'c2':10},{'c1':2, 'c2':30},{'c1':1,'c2':20},{'c1':2,'c2':15},{'c1':2,'c2':100}])
>>> df.sort(['c1','c2'], ascending=[False,True])
   c1   c2
0   3   10
3   2   15
1   2   30
4   2  100
2   1   20

Did you paste it as is? The output in your question is what I get with ascending=[True,True]:

>>> df.sort(['c1','c2'], ascending=[True,True])
   c1   c2
2   1   20
3   2   15
1   2   30
4   2  100
0   3   10

#2


23  

Using sort can produce a warning message (see the GitHub discussion), so you may want to use sort_values instead; docs here.

Then your code can look like this:

df = df.sort_values(by=['c1','c2'], ascending=[False,True])

#3


7  

As far as I understand, the DataFrame.sort() method is deprecated in pandas > 0.18. To solve your problem, use DataFrame.sort_values() instead:

f.sort_values(by=["c1","c2"], ascending=[False, True])

The output looks like this:

   c1   c2
0   3   10
3   2   15
1   2   30
4   2  100
2   1   20

#4


4  

In my case, the accepted answer didn't work:

f.sort_values(by=["c1","c2"], ascending=[False, True])

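sort_values returns a new, sorted DataFrame and leaves the original unchanged, so the result has to be assigned back. Only the following worked as expected: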

f = f.sort_values(by=["c1","c2"], ascending=[False, True])
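
A small sketch of why the assignment matters, using the question's data. The names sorted_df and df2 are illustrative, and inplace=True is shown only as an alternative to reassigning.

```python
import pandas as pd

df = pd.DataFrame({'c1': [3, 2, 1, 2, 2], 'c2': [10, 30, 20, 15, 100]})

# sort_values returns a new DataFrame; the original keeps its row order
sorted_df = df.sort_values(by=['c1', 'c2'], ascending=[False, True])
print(df.index.tolist())         # still [0, 1, 2, 3, 4]
print(sorted_df.index.tolist())  # [0, 3, 1, 4, 2]

# Option 1: assign the result back to the same name
df = df.sort_values(by=['c1', 'c2'], ascending=[False, True])

# Option 2: sort the existing object in place (do not assign the return value)
df2 = pd.DataFrame({'c1': [3, 2, 1, 2, 2], 'c2': [10, 30, 20, 15, 100]})
df2.sort_values(by=['c1', 'c2'], ascending=[False, True], inplace=True)
```

In an interactive session the unassigned call still prints the sorted frame, which can hide the fact that the original was never changed.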

#5


2  

If you are running this code from a script file, you have to assign the result back, like this (using sort_values, since sort is deprecated):

df = df.sort_values(['c1','c2'], ascending=[False,True])

#6


1  

I have found this to be really useful:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list(range(0, 10)) * 2, 'B': np.random.randint(20, 30, 20)})

# A ascending, B descending
df.sort_values(**skw(columns=['A', '-B']))

# A descending, B ascending
df.sort_values(**skw(columns=['-A', '+B']))

Note that unlike the standard by= and ascending= arguments, here the column names and their sort order are in the same place. As a result your code gets a lot easier to read and maintain.

Note that the actual call to .sort_values is unchanged; skw (sortkwargs) is just a small helper function that parses the column names and returns the usual by= and ascending= parameters for you. Pass it any other sort kwargs as you usually would. Copy/paste the following code into e.g. your local utils.py, then forget about it and just use it as above.

# utils.py (or anywhere else convenient to import)
def skw(columns=None, **kwargs):
    """Build sort_values kwargs from column names prefixed with '+' or '-'."""
    # a name without a sign prefix defaults to ascending ('+')
    sort_cols = [col if col[0] in '+-' else '+' + col for col in columns]
    # strip only the leading sign, so names that contain '+' or '-' survive
    by = [col[1:] for col in sort_cols]
    ascending = [col[0] == '+' for col in sort_cols]
    kwargs.update(by=by, ascending=ascending)
    return kwargs

#7


0  

Note: everything above is correct; just replace sort with sort_values(). So it becomes:

import pandas as pd
df = pd.read_csv('data.csv')
# by= is required when sorting a DataFrame; c1 and c2 are the question's columns
df.sort_values(by=['c1', 'c2'], ascending=False, inplace=True)
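
As a hedged aside on the by= argument: a Series can be sorted without it, but a DataFrame cannot. The sketch below uses the question's data instead of a CSV file; the names s_sorted, df_sorted and missing_by_error are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'c1': [3, 2, 1, 2, 2], 'c2': [10, 30, 20, 15, 100]})

# A Series holds a single column of values, so no by= is needed
s_sorted = df['c2'].sort_values(ascending=False)

# A DataFrame needs by= to say which column(s) to sort on;
# leaving it out raises a TypeError for the missing argument
try:
    df.sort_values(ascending=False)
    missing_by_error = None
except TypeError as exc:
    missing_by_error = exc

df_sorted = df.sort_values(by='c2', ascending=False)
print(df_sorted)
```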

Refer to the official website here.
