I have a data frame like this:
print(testDB)
0 1 2
0 354.7 April 4.0
1 55.4 August 8.0
2 176.5 December 12.0
3 95.5 February 2.0
4 85.6 January 1.0
5 152 July 7.0
6 238.7 June 6.0
7 104.8 March 3.0
8 283.5 May 5.0
9 278.8 November 11.0
10 249.6 October 10.0
11 212.7 September 9.0
As you can see, the months are not in calendar order, so I created another column (column 2) holding the month number for each month. How can I sort this data frame into calendar month order using that column? Thanks.
2 Answers
#1
82
Use sort_values to sort the df by a specific column's values:
In [18]:
df.sort_values('2')
Out[18]:
0 1 2
4 85.6 January 1.0
3 95.5 February 2.0
7 104.8 March 3.0
0 354.7 April 4.0
8 283.5 May 5.0
6 238.7 June 6.0
5 152.0 July 7.0
1 55.4 August 8.0
11 212.7 September 9.0
10 249.6 October 10.0
9 278.8 November 11.0
2 176.5 December 12.0
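For reference, here is a self-contained sketch that rebuilds the question's frame and applies the same call. It assumes the column labels are the strings '0', '1' and '2', as in this answer; if they are integers, as the printed frame in the question may suggest, df.sort_values(2) would be the equivalent call. The names sorted_df and sorted_df_reset are chosen for this sketch only.

```python
import pandas as pd

# Data copied from the question; the string column labels are an assumption.
df = pd.DataFrame({
    "0": [354.7, 55.4, 176.5, 95.5, 85.6, 152.0,
          238.7, 104.8, 283.5, 278.8, 249.6, 212.7],
    "1": ["April", "August", "December", "February", "January", "July",
          "June", "March", "May", "November", "October", "September"],
    "2": [4.0, 8.0, 12.0, 2.0, 1.0, 7.0, 6.0, 3.0, 5.0, 11.0, 10.0, 9.0],
})

# sort_values returns a new frame; df itself is left unchanged.
sorted_df = df.sort_values("2")

# Optionally renumber the rows 0..11 after sorting.
sorted_df_reset = sorted_df.reset_index(drop=True)

print(sorted_df)
```

Because sort_values does not modify df in place by default, the result has to be assigned to a name (or back to df) to be kept.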
If you want to sort by two columns, pass a list of column labels to sort_values, with the column labels ordered according to sort priority. If you use df.sort_values(['2', '0']), the result would be sorted by column 2, then column 0. Granted, this does not really make sense for this example because each value in df['2'] is unique.
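To see the second sort key actually take effect, a small sketch with repeated values in column 2 may help. The data below is hypothetical and is not taken from the question; only the column labels follow the answer.

```python
import pandas as pd

# Hypothetical data with repeated month numbers, so the second key matters.
df = pd.DataFrame({
    "0": [30.0, 10.0, 20.0, 5.0],
    "2": [2.0, 1.0, 1.0, 2.0],
})

# Sort by column "2" first, then break ties using column "0".
by_both = df.sort_values(["2", "0"])

# Each key can have its own direction: "2" ascending, "0" descending.
mixed = df.sort_values(["2", "0"], ascending=[True, False])

print(by_both)
print(mixed)
```

Passing a list to ascending, one entry per sort key, is how the direction is controlled per column.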
#2
0
Just adding some more operations on the data. Suppose we have a dataframe df; we can apply several operations to get the desired output:
ID cost tax label
1 216590 1600 test
2 523213 1800 test
3 250 1500 experiment
df['label'].value_counts().to_frame().reset_index().sort_values('label', ascending=False)
This gives the label counts as a dataframe, sorted in descending order:
index label
0 test 2
1 experiment 1
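The column names that value_counts() followed by reset_index() produces depend on the pandas version: older releases give index and label as shown above, while pandas 2.0 and later name them label and count. A sketch that names both columns explicitly, and so should behave the same either way, could look like this. The column name n and the variable counts are chosen for this sketch and do not appear in the original answer.

```python
import pandas as pd

# Frame rebuilt from the example table above.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "cost": [216590, 523213, 250],
    "tax": [1600, 1800, 1500],
    "label": ["test", "test", "experiment"],
})

# Name the index and the counts explicitly so the resulting column
# names do not depend on the pandas version.
counts = (
    df["label"]
    .value_counts()
    .rename_axis("label")
    .reset_index(name="n")
    .sort_values("n", ascending=False)
)

print(counts)
```

Note that value_counts() already returns the counts in descending order, so the final sort_values mostly makes the intent explicit.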