
Time: 2021-04-08 13:08:58



Sorry if this is a trivial question, but I have spent a few hours reading answers in this database and could not find what I am looking for.


I have a data frame similar to this:


df=data.frame(v1=c(24,15, 0, 7,36,10), c1=c(22,15,0,0,28,11), v2=c(0,10,0,19,0,0), c2=c(0,7,0,22,0,0), v3=c(54,22,28,55,62,38), c3=c(44,23,22,66,71,44))

(The original, of course, has many more rows and columns)


I would like to create two columns with the maximum and the second highest values of all the "v" columns.


For the maximum, this works:


df$max.v=mapply(FUN=max, df$v1, df$v2, df$v3, na.rm=TRUE)

But I cannot find a way to do it for the second highest value. It probably needs some kind of function, but I could not find how to do it.


1 solution



Note that the accepted answer in the question linked by @krlmlr is dubious, because apply converts a data frame to a matrix first, which can silently change column types. It doesn't matter much in this case, because all the columns must be numeric for the question to make sense, but I prefer to err on the safe side.
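
To illustrate the concern about apply, here is a small sketch. The data frame `mixed` and its `id` column are made up for this example and are not part of the question's data; the point is only that one non-numeric column is enough to turn every value into a string.

```r
# Hypothetical data frame with one character column next to numeric ones.
mixed <- data.frame(id = c("a", "b", "c"),
                    v1 = c(5, 10, 2),
                    v2 = c(7, 3, 20),
                    stringsAsFactors = FALSE)

# apply() calls as.matrix() on a data frame before looping over rows.
# A matrix holds a single type, so the numeric columns become character.
as_mat <- as.matrix(mixed)
is.character(as_mat)

# The row-wise "maximum" is therefore a string comparison, not a numeric one.
row_max_apply <- apply(mixed, 1, max)
is.character(row_max_apply)
```

With all-numeric columns, as in the question, the coercion is harmless, which is why the problem is easy to miss.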


Instead, use do.call with mapply, and persuade it to treat a df as a list:


do.call(mapply, c(function(...) sort(c(...), decreasing=TRUE)[1:2],
        df[grepl("^v", names(df))]))
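
The call above returns a matrix with two rows (highest and second highest) and one column per row of df. A minimal sketch of how that result might be attached as the two columns the question asks for, using the question's sample data; the names `v_cols`, `top2` and `second.v` are chosen here for illustration, and the `"^v"` pattern assumes the columns of interest are exactly those whose names start with "v":

```r
df <- data.frame(v1 = c(24, 15, 0, 7, 36, 10), c1 = c(22, 15, 0, 0, 28, 11),
                 v2 = c(0, 10, 0, 19, 0, 0),   c2 = c(0, 7, 0, 22, 0, 0),
                 v3 = c(54, 22, 28, 55, 62, 38), c3 = c(44, 23, 22, 66, 71, 44))

# Select the "v" columns once, before adding new columns such as max.v,
# so the helper columns are never picked up by the pattern.
v_cols <- df[grepl("^v", names(df))]

# For each row, sort the values in decreasing order and keep the top two.
# unname() drops the v1/v2/v3 labels so the result carries no misleading names.
top2 <- do.call(mapply,
                c(function(...) unname(sort(c(...), decreasing = TRUE))[1:2],
                  v_cols))

df$max.v    <- top2[1, ]   # 54 22 28 55 62 38
df$second.v <- top2[2, ]   # 24 15  0 19 36 10
```

One thing to keep in mind: sort() drops NA values by default, so a row with fewer than two non-missing "v" values would get NA in second.v.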


