How to apply rowSums() in R to select the top n rows by row-sum value?

Date: 2021-12-07 22:55:20

I am reading my data from a CSV file. I want to sum over the rows of the data, then sort the rows by their row-sum values. Finally, I want to select rows based on a specified threshold on the row sum. I tried this on tempdata.csv, which contains the following data:

> data <- read.csv("tempdata.csv")
> data

        X Doc1 Doc2 Doc3 Doc4
1    book    2    0    2    1
2   table    0    2    0    1
3    room    0    2    0    0
4   chair    0    0    2    0
5 speaker    0    0    0    0

> m <- data.matrix(data[2:length(data)], rownames.force = NA)
> (dimnames(m)[[1]] <- data[, 1])
> rs1 <- rowSums(m, na.rm = FALSE)

Now I don't know how to combine the row-sum values with the matrix 'm'. I am very new to R and am not able to write optimized code to achieve this. Please help me; thanks in advance.

2 Solutions

#1


1  

This sorts the rows of a data frame or matrix by their row sums, largest first:

m[sort(rowSums(m), index.return = TRUE, decreasing = TRUE)$ix, ]

If you only want the rows that meet a threshold, you don't need to sort:

m[rowSums(m) > threshold, ]

If you want to add a column containing the row sums:

m <- cbind(m, rowSums(m))
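
As a minimal sketch of the steps above on the sample data from the question: the matrix is rebuilt inline here instead of being read from tempdata.csv, and the names `n`, `threshold`, `sorted`, `top_n`, `above`, `with_sum` and the column name `sum` are illustrative choices rather than part of the original answer. It uses order(), which should give the same row ordering as the sort(...)$ix call for this purpose, and head() to take the top n rows.

```r
# Rebuild the question's sample data inline (assumption: same values as tempdata.csv)
m <- matrix(
  c(2, 0, 2, 1,
    0, 2, 0, 1,
    0, 2, 0, 0,
    0, 0, 2, 0,
    0, 0, 0, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("book", "table", "room", "chair", "speaker"),
                  paste0("Doc", 1:4))
)

rs <- rowSums(m)

# Sort rows by row sum, largest first
sorted <- m[order(rs, decreasing = TRUE), , drop = FALSE]

# Keep the top n rows by row sum (n is a hypothetical choice)
n <- 2
top_n <- head(sorted, n)

# Keep rows whose sum exceeds a threshold; no sorting needed
threshold <- 2
above <- m[rs > threshold, , drop = FALSE]

# Append the row sums as a named column
with_sum <- cbind(m, sum = rs)
```

The `drop = FALSE` argument keeps the result a matrix even when only one row is selected.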

#2


0  

Thank you @6pool for your answer. I used the following code to achieve the goal.

data <- read.csv("tiny.csv")
data2 <- data[, 2:length(data)]
data2 <- transform(data2, sum=rowSums(data2))
(dimnames(data2)[[1]] <- data[,1])
data3 <- data2[order(-data2$sum),]
# Specify the threshold on the row sum used to select rows
threshold <- 3
(data4 <- data3[data3$sum>= threshold, ])
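
A somewhat shorter data-frame variant of the same idea can be sketched as follows. It assumes the first CSV column holds the row labels, so `row.names = 1` can replace the manual dimnames() assignment; the CSV content is supplied inline through the `text` argument only to keep the example self-contained, and `csv_text`, `df` and `selected` are hypothetical names.

```r
# Inline copy of the sample data (assumption: same layout as the CSV in the question)
csv_text <- "X,Doc1,Doc2,Doc3,Doc4
book,2,0,2,1
table,0,2,0,1
room,0,2,0,0
chair,0,0,2,0
speaker,0,0,0,0"

# row.names = 1 uses the first column as row names instead of a data column
df <- read.csv(text = csv_text, row.names = 1)

# Add the row sums, then sort by them in decreasing order
df$sum <- rowSums(df)
df <- df[order(-df$sum), ]

# Keep rows whose sum meets the threshold
threshold <- 3
selected <- df[df$sum >= threshold, ]
```

With a real file, `read.csv("tempdata.csv", row.names = 1)` would take the place of the `text =` call.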
