I've just started with R and I'm still finding my way with the syntax. I want to get the frequencies for a scale variable that takes the values 0 through 10 and NA.
Id <- c(1,2,3,4,5)
ClassA <- c(1,NA,3,1,1)
ClassB <- c(2,1,1,3,3)
R <- c(5,5,7,NA,9)
S <- c(3,7,NA,9,5)
df <- data.frame(Id,ClassA,ClassB,R,S)
library(plyr)
count(df,'R')
I get this result:
   R freq
1  5    2
2  7    1
3  9    1
4 NA    1
I'm looking for this result:
    R freq
1   0    0
2   1    0
3   2    0
4   3    0
5   4    0
6   5    2
7   6    0
8   7    1
9   8    0
10  9    1
11 10    0
12 NA    1
Suppose I have a vector of the possible answers:
RAnswers <- c(0,1,2,3,4,5,6,7,8,9,10,NA)
How would I apply it to the data set to get the above result?
2 solutions
#1
Here's a base R solution built around table() and match(); values that never occur are then set to 0:
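freq <- table(df$R, useNA = 'ifany')
freq
##
##    5    7    9 <NA>
##    2    1    1    1
# All allowed values, including those that never occur
R <- c(0:10, NA)
# match() looks up each allowed value among the observed ones (NA matches NA);
# as.vector() drops the table class so data.frame() gets a plain column
df2 <- data.frame(R = R, freq = as.vector(freq[match(R, as.integer(names(freq)))]))
# Values without a match have freq NA: they occurred zero times
df2$freq[is.na(df2$freq)] <- 0
df2
##     R freq
## 1   0    0
## 2   1    0
## 3   2    0
## 4   3    0
## 5   4    0
## 6   5    2
## 7   6    0
## 8   7    1
## 9   8    0
## 10  9    1
## 11 10    0
## 12 NA    1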
Edit: Frank has a better answer. Here's how you can use table() on a factor to get the required output:
setNames(nm = c('R', 'freq'), data.frame(table(factor(df$R, levels = RAnswers, exclude = NULL))))
##       R freq
## 1     0    0
## 2     1    0
## 3     2    0
## 4     3    0
## 5     4    0
## 6     5    2
## 7     6    0
## 8     7    1
## 9     8    0
## 10    9    1
## 11   10    0
## 12 <NA>    1
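To make the factor approach reusable, here is a minimal sketch that wraps it in a function. The name count_levels and its arguments are hypothetical, not part of base R or of the answers above. It assumes the allowed values are passed as a vector that may contain NA.

```r
# Hypothetical helper: count how often each allowed value occurs,
# keeping values that never appear (count 0) and NA.
count_levels <- function(x, allowed) {
  # exclude = NULL keeps NA as a factor level when `allowed` contains NA
  f <- factor(x, levels = allowed, exclude = NULL)
  # table() counts every level of the factor, observed or not
  out <- as.data.frame(table(f), stringsAsFactors = FALSE)
  names(out) <- c("value", "freq")
  out
}

R_obs <- c(5, 5, 7, NA, 9)               # same values as df$R in the question
res <- count_levels(R_obs, c(0:10, NA))
print(res)
```

Because the levels are fixed up front, the result always has one row per allowed value, whatever the data happen to contain. Note that the value column comes back as character here, since factor levels are stored as strings.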
#2
This kind of task is easily done with the dplyr package. To keep the unused values of R, define R as a factor and use tidyr's complete() function:
library(dplyr)
library(tidyr)
df %>%
  mutate(R = factor(R, levels = 0:10)) %>%  # 0:10, so that 0 is kept as a level
  group_by(R) %>%
  summarise(freq = n()) %>%
  complete(R, fill = list(freq = 0))
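For anyone who wants to see the idea behind complete() without extra packages, here is a rough base R sketch. It swaps in merge() as a left join, which is not what tidyr does internally. The data frames obs and allowed are made up for illustration, with obs holding the counts the summarise() step would give for the question's data.

```r
# Observed counts, as group_by() + summarise(n()) would report them
obs <- data.frame(R = c(5, 7, 9, NA), freq = c(2L, 1L, 1L, 1L))

# Every value the scale allows, including NA
allowed <- data.frame(R = c(0:10, NA))

# Left join: keep every allowed value; values with no observed row get freq = NA.
# By default merge() treats NA keys as matching each other.
full <- merge(allowed, obs, by = "R", all.x = TRUE)

# A missing count means the value occurred zero times
full$freq[is.na(full$freq)] <- 0L
print(full)
```

The join plays the role of complete() (adding the missing combinations), and the final assignment plays the role of its fill argument.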