How to label histogram bars with data values or percentages in R

Time: 2022-02-15 14:58:15

I'd like to label each bar of a histogram with either the number of counts in that bin or the percent of total counts that are in that bin. I'm sure there must be a way to do this, but I haven't been able to find it. This page has a couple of pictures of SAS histograms that do basically what I'm trying to do (but the site doesn't seem to have R versions): http://www.ats.ucla.edu/stat/sas/faq/histogram_anno.htm


If possible, it would also be nice to have the flexibility to put the labels above or somewhere inside the bars, as desired.


I'm trying to do this with the base R plotting facilities, but I'd be interested in methods to do this in ggplot2 and lattice as well.


2 answers

#1


32  

To include the number of counts, you can just set labels=TRUE.


The example below is just slightly adapted from one on the hist() help page:


hist(islands, col="gray", labels = TRUE, ylim=c(0, 45))


Getting percentages is a bit more involved. The only way I know to do that is to directly manipulate the object returned by a call to hist(), as described in a bit more detail in my answer to this similar question:


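histPercent <- function(x, ...) {
   # Compute the bins without drawing anything yet
   H <- hist(x, plot = FALSE)
   # Turn densities into percentages (assumes all bins have the same width)
   H$density <- with(H, 100 * density * diff(breaks)[1])
   labs <- paste(round(H$density), "%", sep = "")
   # Plot the percentages as bar heights, leaving headroom for the labels
   plot(H, freq = FALSE, labels = labs, ylim = c(0, 1.08 * max(H$density)), ...)
}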

histPercent(islands, col="gray")
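The function above multiplies each density by the width of the first bin, so it relies on all bins being equally wide. As a hedged alternative sketch, the percentages can also be taken straight from the bin counts, which does not depend on bin width. The name histPercentCounts is hypothetical (it is not part of base R or of the original answer), and unlike histPercent it leaves the bar heights as counts and changes only the labels.

```r
# Sketch only: histPercentCounts is a made-up helper name, not from the answer.
histPercentCounts <- function(x, ...) {
  # Compute the bins without drawing
  H <- hist(x, plot = FALSE)
  # Share of all observations that falls in each bin, in percent
  pct <- 100 * H$counts / sum(H$counts)
  labs <- paste0(round(pct), "%")
  # Bars keep their count heights; only the labels show percentages
  plot(H, labels = labs, ylim = c(0, 1.08 * max(H$counts)), ...)
  # Return the unrounded percentages invisibly for further use
  invisible(pct)
}

pct <- histPercentCounts(islands, col = "gray")
```

Keeping counts on the y axis while labelling with percentages is a design choice of this sketch; to put percentages on the axis as well, the approach in histPercent is the one to use.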


#2


5  

Adding numbers at the tops of the bars in barplots or histograms distorts the visual interpretation of the bars. Even putting the labels inside the bars near the top creates a fuzzy-top effect that makes it harder for the viewer to properly interpret the graph. If the numbers are of interest, then this creates a poorly laid out table; why not just create a proper table?


If you really feel the need to add the numbers then it is better to put them below the bars or along the top margin so that they line up better for easier comparison and don't interfere with the visual interpretation of the graph. Labels can be added to base graphs using the text or mtext functions and the x locations can be found in the return value from the hist function. Heights for plotting can be computed using the grconvertY function.
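A minimal sketch of that suggestion, assuming the islands data used in the other answer: the bin midpoints and counts come from the value returned by hist(), text() draws a row of counts just below the bars, and mtext() writes a row of percentages along the top margin. The negative lower ylim, the 0.03 height fraction, and the cex value are illustrative choices, not something the answer specifies.

```r
# Compute the bins first so the counts are known before plotting
H <- hist(islands, plot = FALSE)
top <- max(H$counts)

# Leave some empty space under y = 0 so a row of labels fits below the bars
plot(H, col = "gray", ylim = c(-0.1 * top, top))

# Height for the lower row: 3% of the way up the plot region, converted
# from normalized plot coordinates ("npc") to user coordinates
y_below <- grconvertY(0.03, from = "npc", to = "user")
text(H$mids, y_below, labels = H$counts, cex = 0.8)

# Percentages along the top margin, aligned with the bin midpoints
pct <- round(100 * H$counts / sum(H$counts))
mtext(paste0(pct, "%"), side = 3, line = 0, at = H$mids, cex = 0.8)
```

Because both rows sit at a fixed height rather than on top of each bar, the numbers line up horizontally and the bar tops stay clean.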

