I have a data.frame as shown below. I would like to list the cells that do not contain a single digit or letter (a-z), together with their frequencies. How can I do that? For the data below I want a table whose first column holds "*" and "." and whose second column shows how often each value occurs (1 and 2, respectively). "a*" and "21.9" should not appear because they contain at least one digit or letter.
sm <- matrix(c(51,".",22,"*","a*","21.9",".",22,9),ncol=3,byrow=TRUE)
smdf<-as.data.frame(sm)
1 solution
#1
Does this provide what you are looking for?
library(plyr)  # provides count()
sm <- matrix(c(51,".",22,"*","a*","21.9",".",22,9),ncol=3,byrow=TRUE)
# keep only the cells that contain no letter or digit, then tabulate them
count(sm[!grepl("[[:alnum:]]", sm)])
x freq
1 * 1
2 . 2
If you also want to exclude NA values and spaces, add the corresponding conditions to the filter. As a side note, I am fairly sure a more elegant regex could do this without the extra conditions, but my regex skills are still a work in progress. I will update the answer if I figure one out.
sm <- matrix(c(51,".",22,"*","a*","21.9",".",22,9, " ", NA, 13),ncol=3,byrow=TRUE)
count(sm[!grepl("[[:alnum:]]", sm) & !is.na(sm) & sm != " "])
x freq
1 * 1
2 . 2
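As a possible version of the "more elegant regex" mentioned above, here is a minimal sketch that folds the NA and space checks into one anchored pattern. It relies on grepl() returning FALSE for NA input, and it swaps plyr's count() for base R's table(); the names keep and symbol_counts are only illustrative and do not come from the original answer.

```r
sm <- matrix(c(51, ".", 22, "*", "a*", "21.9", ".", 22, 9, " ", NA, 13),
             ncol = 3, byrow = TRUE)

# TRUE only for cells made up entirely of characters that are neither
# alphanumeric nor whitespace; NA and " " both yield FALSE here.
keep <- grepl("^[^[:alnum:][:space:]]+$", sm)

# Tabulate the surviving cells with base R (assumption: table() is used
# here in place of plyr::count()).
symbol_counts <- as.data.frame(table(sm[keep]), stringsAsFactors = FALSE)
names(symbol_counts) <- c("x", "freq")
symbol_counts
```

Because the pattern is anchored with ^ and $ and requires at least one character, empty strings are dropped as well.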
However, if there is a specific set of characters you want to count, you can put them in a vector and count only those. This approach needs no extra conditions for spaces or NA.
sm <- matrix(c(51,".",22,"*","a*","21.9",".",22,9, " ", NA, 13),ncol=3,byrow=TRUE)
x <- unlist(strsplit("*~!@#$%^&(){}_+:\"<>?,./;'[]-=", split=""))
count(sm[sm %in% x])
x freq
1 * 1
2 . 2
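The question starts from the data.frame smdf, while the snippets above filter the matrix sm. A minimal sketch of the same filter applied to the data.frame, using base R only, might look like the following. The names cells, symbols_only and symbol_table are hypothetical, and the as.character() step is an assumption to cover R versions before 4.0, where as.data.frame() turns character columns into factors.

```r
sm <- matrix(c(51, ".", 22, "*", "a*", "21.9", ".", 22, 9),
             ncol = 3, byrow = TRUE)
smdf <- as.data.frame(sm)

# Flatten all columns into one character vector; as.character() also
# handles factor columns.
cells <- unlist(lapply(smdf, as.character), use.names = FALSE)

# Keep non-missing cells that contain no letter or digit.
symbols_only <- cells[!is.na(cells) & !grepl("[[:alnum:]]", cells)]

# Frequency of each remaining value.
symbol_table <- table(symbols_only)
symbol_table
```

table() returns a named count vector rather than a data.frame; wrap it in as.data.frame() if a two-column table is needed.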