This question already has an answer here:
- Extract the maximum value within each group in a dataframe [duplicate] 3 answers
Given the data frame below, is there a way to select the highest id for each gene?
gene_name <- c("AADACL2", "AADACL3", "AADACL4", "AADACL4", "AADACL4", "AADACL4", "AADACL4", "AADACL4")
target_id <- c(79.0524, 62.0098, 61.6708, 65.1106, 58.6207, 63.9706, 64.3735, 61.3232)
table <- data.frame(gene_name = gene_name, id = target_id)
I want a dataframe that looks something like this instead:
gene_name_2 <- c("AADACL2", "AADACL3", "AADACL4")
target_id_2 <- c(79.0524, 62.0098, 65.1106)
table_2 <- data.frame(gene_name = gene_name_2, id = target_id_2)
My real data set is much bigger than this, so I need to do it for a lot of genes, and I can't work out a way to do it.
1 solution
#1
0
aggregate(. ~ gene_name, table, max)
gene_name id
1 AADACL2 79.0524
2 AADACL3 62.0098
3 AADACL4 65.1106
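One thing to keep in mind with `aggregate` and `max`: the maximum is taken for each column separately, so if the real data has more columns than `id`, the values in one output row may come from different input rows. Below is a minimal base R sketch that keeps whole rows instead. The names `genes`, `group_max` and `best` are mine, not from the question, and the data is just the question's example rebuilt under a name that does not shadow `table()`.

```r
# Example data, same values as in the question (name `genes` is an assumption)
genes <- data.frame(
  gene_name = c("AADACL2", "AADACL3", "AADACL4", "AADACL4",
                "AADACL4", "AADACL4", "AADACL4", "AADACL4"),
  id = c(79.0524, 62.0098, 61.6708, 65.1106,
         58.6207, 63.9706, 64.3735, 61.3232)
)

# ave() returns, for every row, the maximum id within that row's gene
group_max <- ave(genes$id, genes$gene_name, FUN = max)

# Keep only the rows whose id equals its group maximum (tied rows are all kept)
best <- genes[genes$id == group_max, ]
best
```

Because the filter works on row positions, any extra columns in the data frame come along with the selected row unchanged.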
library(tidyverse)
table %>% group_by(gene_name) %>% arrange(desc(id)) %>% top_n(1, id)
# A tibble: 3 x 2
# Groups: gene_name [3]
gene_name id
<fctr> <dbl>
1 AADACL2 79.0524
2 AADACL4 65.1106
3 AADACL3 62.0098
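In dplyr 1.0.0 and later, `top_n()` is superseded by `slice_max()`, so the same idea could be written as in the sketch below. This assumes a recent dplyr is installed; the names `genes` and `top_per_gene` are mine, and `with_ties = FALSE` is a choice I made so that exactly one row per gene comes back even if two ids tie for the maximum.

```r
library(dplyr)

# Example data, same values as in the question (name `genes` is an assumption)
genes <- data.frame(
  gene_name = c("AADACL2", "AADACL3", "AADACL4", "AADACL4",
                "AADACL4", "AADACL4", "AADACL4", "AADACL4"),
  id = c(79.0524, 62.0098, 61.6708, 65.1106,
         58.6207, 63.9706, 64.3735, 61.3232)
)

# For each gene, keep the single row with the largest id
top_per_gene <- genes %>%
  group_by(gene_name) %>%
  slice_max(id, n = 1, with_ties = FALSE) %>%
  ungroup()

top_per_gene
```

Leave `with_ties` at its default of `TRUE` if tied rows should all be kept, which is closer to what `top_n(1, id)` does.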