How do I sort and filter data?

Time: 2022-12-25 22:50:37

I understand how to sort a data frame:

df[order(df$Height),]

and I understand how to filter (or subset) a data frame matching some predicate:

df[df$Weight > 120,]

but how do I sort and filter (as an example, order by Height and filter by Weight)?

3 solutions

#1


14  

Either in two steps:

 df1 <- df[df$weight > 120, ]
 df2 <- df1[order(df1$height), ]

or, if you must, in one step, though it really is not any cleaner.

Data first:

R> set.seed(42)
R> df <- data.frame(weight=rnorm(10, 120, 10), height=rnorm(10, 160, 20))
R> df
   weight height
1   133.7  186.1
2   114.4  205.7
3   123.6  132.2
4   126.3  154.4
5   124.0  157.3
6   118.9  172.7
7   135.1  154.3
8   119.1  106.9
9   140.2  111.2
10  119.4  186.4

And one way of doing it is double-subsetting:

R> subset(df, weight > 120)[order(subset(df, weight > 120)$height),]
  weight height
9  140.2  111.2
3  123.6  132.2
7  135.1  154.3
4  126.3  154.4
5  124.0  157.3
1  133.7  186.1
R> 

I'd go with the two-step.
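
If the two-step pattern comes up often, it could be wrapped in a small helper. The sketch below is only an illustration: the function name filter_then_sort and its arguments are hypothetical, not part of base R. It uses which() so that NA values in the condition drop the row instead of producing NA rows.

```r
set.seed(42)
df <- data.frame(weight = rnorm(10, 120, 10), height = rnorm(10, 160, 20))

# Hypothetical helper: keep the rows where `keep` is TRUE,
# then sort what is left by the column named in `by`.
filter_then_sort <- function(data, keep, by) {
  kept <- data[which(keep), , drop = FALSE]   # step 1: filter
  kept[order(kept[[by]]), , drop = FALSE]     # step 2: sort the filtered rows
}

res <- filter_then_sort(df, df$weight > 120, "height")
res
```

The condition is still evaluated on the unsorted df, which is safe here because the filter runs before any reordering.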

#2


11  

The data.table package lets you do this in one short line of code:

Borrowing Dirk Eddelbuettel's example, set up some data:

set.seed(42)
df <- data.frame(weight=rnorm(10, 120, 10), height=rnorm(10, 160, 20))

Convert the data.frame to a data.table and subset on weight, ordering by height:

library(data.table)
dt <- data.table(df)

dt[weight>120][order(height)]

       weight   height
[1,] 140.1842 111.1907
[2,] 123.6313 132.2228
[3,] 135.1152 154.3149
[4,] 126.3286 154.4242
[5,] 124.0427 157.3336
[6,] 133.7096 186.0974
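
As a sanity check, the chained expression can be compared against the base R two-step version. This is a sketch that assumes data.table is installed; it uses as.data.table() for the conversion, and the names res_dt, res_df and res_swapped are illustrative.

```r
library(data.table)

set.seed(42)
df <- data.frame(weight = rnorm(10, 120, 10), height = rnorm(10, 160, 20))
dt <- as.data.table(df)

# Filter in the first bracket, then sort that result in the second.
res_dt <- dt[weight > 120][order(height)]

# Base R two-step version for comparison.
df1 <- df[df$weight > 120, ]
res_df <- df1[order(df1$height), ]

# Compare column values only; data.table does not keep the original row names.
all.equal(res_dt$height, res_df$height)
all.equal(res_dt$weight, res_df$weight)

# Each bracket evaluates column names in the table it is applied to,
# so sorting first and filtering second selects the same rows.
res_swapped <- dt[order(height)][weight > 120]
all.equal(res_swapped$height, res_dt$height)
```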

#3


2  

sorted <- df[order(df$height), ]
df1 <- sorted[sorted$weight > 120, ]

If you put the order before the filter, make sure the filter condition is computed on the sorted data frame rather than on the original df.
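
One caveat with ordering before filtering in a single expression: a condition such as df$weight > 120 is evaluated against the original row order, so applying it to the sorted frame selects rows by position and can return the wrong ones. A minimal sketch with made-up values (the data and the names sorted, wrong and right are illustrative assumptions):

```r
# Made-up values, chosen so that sorting by height changes the row order.
df <- data.frame(weight = c(130, 110, 125, 115),
                 height = c(180, 150, 170, 160))

sorted <- df[order(df$height), ]

# Mask built from the unsorted df, applied by position to the sorted rows.
wrong <- sorted[df$weight > 120, ]

# Mask built from the sorted frame itself.
right <- sorted[sorted$weight > 120, ]

wrong$weight   # includes 110, which fails the filter
right$weight   # 125 130
```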
