I would like to create an automated knitr report that produces a histogram for each numeric field in my dataframe. My goal is to do this without having to specify the actual fields (this dataset contains over 70 fields, and I would also like to reuse the script).
I've tried a few different approaches:
- Saving the plot to an object, `p`, and then calling `p` after the loop
  - This only plots the final plot
- Creating an array of plots, `PLOTS <- NULL`, and appending the plots within the loop with `PLOTS <- append(PLOTS, p)`
  - Accessing these plots out of the loop did not work at all
- Even tried saving each to a `.png` file, but would rather not have to deal with the overhead of saving and then re-accessing each file
I'm afraid the intricacies of the plot devices are escaping me.
Question
How can I make the following chunk output each plot within the loop to the report? Currently, the best I can achieve is output of the final plot produced by saving it to an object and calling that object outside of the loop.
R markdown chunk using knitr in RStudio:
```{r plotNumeric, echo=TRUE, fig.height=3}
suppressPackageStartupMessages(library(ggplot2))
FIELDS <- names(df)[sapply(df, class)=="numeric"]
for (field in FIELDS){
qplot(df[,field], main=field)
}
```
From this point, I hope to customize the plots further.
3 Answers
#1
31
Wrap the `qplot` in `print`.
`knitr` will do that for you if the `qplot` is outside a loop, but (at least in the version I have installed) it doesn't detect this inside the loop (which is consistent with the behaviour of the R command line).
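A minimal sketch of that fix, assuming a hypothetical data frame `df` with made-up columns. It uses `ggplot()` in place of `qplot()`, which recent ggplot2 releases deprecate, and `is.numeric` rather than a class comparison so that integer columns are picked up too:

```r
library(ggplot2)

# Hypothetical example data; substitute your own data frame.
set.seed(1)
df <- data.frame(
  height = rnorm(100, mean = 170, sd = 10),
  count  = rpois(100, lambda = 4),        # integer column
  group  = rep(c("a", "b"), each = 50)    # non-numeric, skipped
)

# Names of all numeric (double or integer) columns.
fields <- names(df)[sapply(df, is.numeric)]

for (field in fields) {
  # Copy the column into a one-column data frame so the plot does not
  # depend on the loop variable after this iteration.
  p <- ggplot(data.frame(value = df[[field]]), aes(x = value)) +
    geom_histogram(bins = 30) +
    ggtitle(field)
  print(p)  # explicit print() is needed inside a loop
}
```

Inside a knitr chunk, each `print()` call should emit one figure, so the report gets one histogram per numeric column.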
#2
7
I use child Rmd files in markdown; this also works in Sweave.
In the Rmd file, use the following snippet:
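```{r run-numeric-md, include=FALSE}
library(knitr)  # provides knit_child()
# Knit the child document once per numeric variable and collect the output.
# num_vars and num_var_names are assumed to be defined in an earlier chunk.
out = NULL
for (i in seq_len(num_vars)) {
  out = c(out, knit_child('da-numeric.Rmd'))
}
```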
da-numeric.Rmd looks like:
Variabele `r num_var_names[i]`
------------------------------------
Missing : `r sum(is.na(data[[num_var_names[i]]]))`
Minimum value : `r min(na.omit(data[[num_var_names[i]]]))`
Percentile 1 : `r quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[2]`
Percentile 99 : `r quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[100]`
Maximum value : `r max(na.omit(data[[num_var_names[i]]]))`
```{r results='asis', comment="" }
warn_extreme_values=3
d1 = quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[2] > warn_extreme_values*quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[1]
d99 = quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[101] > warn_extreme_values*quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[100]
if(d1){cat('Warning : Suspect extreme values in left tail')}
if(d99){cat('Warning : Suspect extreme values in right tail')}
```
``` {r eval=TRUE, fig.width=6, fig.height=2}
library(ggplot2)
v <- num_var_names[i]
hp <- ggplot(na.omit(data), aes_string(x=v)) + geom_histogram( colour="grey", fill="grey", binwidth=diff(range(na.omit(data[[v]]))/100))
hp + theme(axis.title.x = element_blank(),axis.text.x = element_text(size=10)) + theme(axis.title.y = element_blank(),axis.text.y = element_text(size=10))
```
See my dataMineR package on GitHub: https://github.com/hugokoopmans/dataMineR
#3
2
As an addition to Hugo's excellent answer, I believe that in 2016 you need to include a `print` command as well:
```{r run-numeric-md, include=FALSE}
out = NULL
for (i in c(1:num_vars)) {
out = c(out, knit_child('da-numeric.Rmd'))
}
```
`r paste(out, collapse = '\n')`