I have the following data frame, which I called ozone:
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
5 NA NA 14.3 56 5 5
6 28 NA 14.9 66 5 6
7 23 299 8.6 65 5 7
8 19 99 13.8 59 5 8
9 8 19 20.1 61 5 9
I would like to extract the highest value from each of Ozone, Solar.R, Wind, ...
Also, if possible, how would I sort Solar.R or any other column of this data frame in descending order?
I tried
max(ozone, na.rm=T)
which gives me the highest value in the dataset.
I have also tried
max(subset(ozone,Ozone))
but got the error "'subset' must be logical".
I can set an object to hold the subset of each column with the following commands
ozone <- subset(ozone, Ozone >0)
max(ozone,na.rm=T)
but it gives the same value of 334, which is the max value of the data frame, not the column.
Any help would be great, thanks.
10 solutions
#1
38
Similar to colMeans, colSums, etc., you could write a column maximum function, colMax, and a column sort function, colSort.
colMax <- function(data) sapply(data, max, na.rm = TRUE)
colSort <- function(data, ...) sapply(data, sort, ...)
I use ... in the second function in hopes of sparking your intrigue.
Get your data:
dat <- read.table(header = TRUE, text = "Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
5 NA NA 14.3 56 5 5
6 28 NA 14.9 66 5 6
7 23 299 8.6 65 5 7
8 19 99 13.8 59 5 8
9 8 19 20.1 61 5 9")
Use the colMax function on the sample data:
colMax(dat)
# Ozone Solar.R Wind Temp Month Day
# 41.0 313.0 20.1 74.0 5.0 9.0
To do the sorting on a single column,
sort(dat$Solar.R, decreasing = TRUE)
# [1] 313 299 190 149 118 99 19
and over all columns use our colSort function,
colSort(dat, decreasing = TRUE) ## compare with '...' above
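One detail worth knowing about colSort: sort() drops NA values by default, so columns with missing values come back shorter and sapply() returns a list rather than a matrix. A minimal sketch, assuming you want a rectangular result, is to forward na.last = TRUE through the ... argument (na.last is a standard argument of sort(); the names as_list and as_matrix are just for illustration):

```r
dat <- read.table(header = TRUE, text = "Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
5 NA NA 14.3 56 5 5
6 28 NA 14.9 66 5 6
7 23 299 8.6 65 5 7
8 19 99 13.8 59 5 8
9 8 19 20.1 61 5 9")

colSort <- function(data, ...) sapply(data, sort, ...)

# Default: NAs are dropped, column lengths differ, so the result is a list
as_list <- colSort(dat, decreasing = TRUE)

# na.last = TRUE keeps the NAs at the end, so every column has 9 values
# and sapply() can simplify the result to a matrix
as_matrix <- colSort(dat, decreasing = TRUE, na.last = TRUE)
```

Note that each column is sorted independently, so rows of the result no longer correspond to rows of the original data.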
#2
27
To get the max of any column you want something like:
max(ozone$Ozone, na.rm = TRUE)
To get the max of all columns, you want:
apply(ozone, 2, function(x) max(x, na.rm = TRUE))
And to sort:
ozone[order(ozone$Solar.R),]
Or to sort in the other direction:
ozone[rev(order(ozone$Solar.R)),]
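A small caveat on rev(order(...)): order() puts NA values last by default, so reversing the result moves the NA rows to the top. A sketch of an alternative, using the decreasing argument of order() on a cut-down, made-up copy of the data (by_rev and by_desc are illustrative names):

```r
ozone <- data.frame(
  Ozone   = c(41, 36, 12, 18, NA, 28),
  Solar.R = c(190, 118, 149, 313, NA, NA)
)

# rev(order(...)) reverses the whole ordering, so the NA rows come first
by_rev <- ozone[rev(order(ozone$Solar.R)), ]

# decreasing = TRUE sorts high to low and still leaves the NA rows at the end
by_desc <- ozone[order(ozone$Solar.R, decreasing = TRUE), ]
```

Both give the non-missing values in descending order; they differ only in where the NA rows end up.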
#3
6
In response to finding the max value for each column, you could try using the apply() function:
> apply(ozone, MARGIN = 2, function(x) max(x, na.rm=TRUE))
Ozone Solar.R Wind Temp Month Day
41.0 313.0 20.1 74.0 5.0 9.0
#4
5
Here's a dplyr solution:
library(dplyr)
# find max for each column
summarise_each(ozone, funs(max(., na.rm=TRUE)))
# sort by Solar.R, descending
arrange(ozone, desc(Solar.R))
UPDATE: summarise_each() has been deprecated in favour of a more featureful family of functions: mutate_all(), mutate_at(), mutate_if(), summarise_all(), summarise_at(), summarise_if().
Here is how you could do it:
# find max for each column
ozone %>%
summarise_if(is.numeric, funs(max(., na.rm=TRUE)))%>%
arrange(Ozone)
or
ozone %>%
summarise_at(vars(1:6), funs(max(., na.rm=TRUE)))%>%
arrange(Ozone)
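In more recent dplyr releases (1.0.0 and later, as far as I know), funs() and the _if/_at/_all variants have themselves been superseded by across(). A sketch of the equivalent, assuming a dplyr version that provides across() and using a cut-down, made-up copy of the data (col_max and sorted are illustrative names):

```r
library(dplyr)

ozone <- data.frame(
  Ozone   = c(41, 36, 12, 18, NA, 28),
  Solar.R = c(190, 118, 149, 313, NA, NA),
  Wind    = c(7.4, 8.0, 12.6, 11.5, 14.3, 14.9)
)

# max of every numeric column, ignoring NAs
col_max <- ozone %>%
  summarise(across(where(is.numeric), ~ max(.x, na.rm = TRUE)))

# sort rows by Solar.R, descending (arrange() places NA rows last)
sorted <- ozone %>% arrange(desc(Solar.R))
```

Since summarise() returns a single row here, there is nothing left to arrange afterwards; sorting is done on the original data frame instead.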
#5
2
Another way would be to use ?pmax
do.call('pmax', c(as.data.frame(t(ozone)),na.rm=TRUE))
#[1] 41.0 313.0 20.1 74.0 5.0 9.0
#6
1
Assuming that your data is in a data.frame called maxinozone, you can do this
max(maxinozone[, 1], na.rm = TRUE) # column 1 (Ozone); maxinozone[1, ] would select row 1 instead
#7
0
max(ozone$Ozone, na.rm = TRUE)
should do the trick. Remember to include na.rm = TRUE, or else R will return NA.
#8
0
max(may$Ozone, na.rm = TRUE)
Without $Ozone, max() searches the whole data frame; this is covered in the swirl package.
I'm studying this course on Coursera too.
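To make the difference concrete, here is a small sketch on made-up values (the data frame and variable names are only for illustration):

```r
dat <- data.frame(
  Ozone   = c(41, 36, NA, 18),
  Solar.R = c(190, 118, 149, 313)
)

whole_frame <- max(dat, na.rm = TRUE)        # largest value anywhere in dat
one_column  <- max(dat$Ozone, na.rm = TRUE)  # largest value in Ozone only
no_na_rm    <- max(dat$Ozone)                # NA, because Ozone has a missing value
```

This is why max(ozone, na.rm = TRUE) in the question keeps returning the same number no matter how the data frame is subset: it never looks at a single column.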
#9
0
Try this solution:
Oz <- subset(data, Month == 5, select = Ozone) # select Ozone values for May (i.e. Month == 5)
summary(Oz) # gives characteristics of the table (one Ozone column), including max, min, ...
#10
0
There is a package, matrixStats, that provides functions for column and row summaries (see the package vignette), but you have to convert your data.frame into a matrix.
Then you run: colMaxs(as.matrix(ozone), na.rm = TRUE)