I would like to create a histogram where the y-axis shows the percentage per facet in ggplot2. I have seen several similar questions but some answers seem outdated or they show the percentage of all observations rather than per facet.
I tried this:
library(ggplot2)
library(scales)
ggplot(mtcars, aes(mpg))+
facet_grid(cyl ~ am)+
stat_count(aes(y=..prop..)) +
theme_bw()+
scale_y_continuous(labels = percent_format())
This seems to work, except that the bin width is not fixed: facets with few observations get wide bars.
How can I fix the bin width?
EDIT: Solution adapted from ACNB. I had overlooked that answer before, and I just saw that Andrey Kolyadin was quicker to provide a more concise solution.
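library(dplyr)  # provides %>%, group_by(), mutate(), summarise()
binwidth <- 1
mtcars.stats <- mtcars %>%
  group_by(cyl, am) %>%
  # assign each mpg value to a bin labelled by its midpoint; n = facet size
  mutate(bin = cut(mpg, breaks = seq(0, 35, binwidth),
                   labels = seq(0 + binwidth, 35, binwidth) - (binwidth / 2)),
         n = n()) %>%
  group_by(cyl, am, bin) %>%
  # share of the facet's observations that fall into each bin
  summarise(p = n() / n[1]) %>%
  ungroup() %>%
  mutate(bin = as.numeric(as.character(bin)))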
ggplot(mtcars.stats, aes(x = bin, y= p)) +
geom_col() +
facet_grid(cyl~am)+
theme_bw()+
scale_y_continuous(labels = percent_format())
2 Solutions
#1
3
As always, I advise not relying on the statistics layer of ggplot2; calculate the necessary statistics before plotting instead:
library('zoo')        # rollmean()
library('tidyverse')
library('scales')     # percent_format(), used in the plot
# Selecting breaks (shared by all facets, so the bin width is fixed)
breaks <- seq.int(min(mtcars$mpg), max(mtcars$mpg), length.out = 19)
# Calculating per-facet proportions
mt_hist <- mtcars %>%
  group_by(cyl, am) %>%
  summarise(x = list(rollmean(breaks, 2)),  # bin midpoints
            count = list(hist(mpg, breaks = breaks, plot = FALSE)$counts)) %>%
  unnest(c(x, count)) %>%
  group_by(cyl, am) %>%
  mutate(count = count / sum(count))
And the plot itself:
ggplot(mt_hist)+
aes(x = x,
y = count)+
geom_col()+
facet_grid(cyl ~ am)+
theme_bw()+
scale_y_continuous(labels = percent_format())
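As a quick sanity check on the idea above, the following minimal sketch recomputes the per-facet proportions with base R only and confirms that they sum to 1 within each cyl/am combination. The names `groups` and `props` are introduced here for illustration and are not part of the answer.

```r
# Same breaks as in the answer: 19 equally spaced break points over mpg
breaks <- seq(min(mtcars$mpg), max(mtcars$mpg), length.out = 19)

# One vector of mpg values per non-empty cyl/am combination (i.e. per facet)
groups <- split(mtcars$mpg, list(mtcars$cyl, mtcars$am), drop = TRUE)

# Bin each facet with the shared breaks, then normalise within the facet
props <- lapply(groups, function(v) {
  counts <- hist(v, breaks = breaks, plot = FALSE)$counts
  counts / sum(counts)
})

# Every facet should add up to 1 (i.e. 100%)
sapply(props, sum)
```

Because the breaks are shared, every facet uses the same bin width, and because the division happens inside each group, the percentages are per facet rather than per data set.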
#2
1
Have you tried adding geom_histogram with the stat argument? Something like:
p <- ggplot(mtcars, aes(mpg))
p <- p + geom_histogram(stat = 'bin')
p <- p + facet_grid(cyl ~ am)
p <- p + stat_count(aes(y=..prop..))
p <- p + theme_bw()
p <- p + scale_y_continuous(labels = percent_format())
p
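Note that the snippet above adds two bar layers (a count histogram and the stat_count proportions), so it may not give the intended picture. A possible variant that stays within ggplot2 is sketched below. It assumes ggplot2 3.3.0 or later (for `after_stat()`) and relies on `stat_bin` computing `density` and `width` separately for each panel, so that `density * width` is the share of that facet's observations in each bin. The `binwidth` value of 1 is only an example.

```r
library(ggplot2)
library(scales)

binwidth <- 1  # example value; the same width is used in every facet

p <- ggplot(mtcars, aes(mpg)) +
  # density * width = count / sum(count), computed within each facet
  geom_histogram(aes(y = after_stat(density * width)), binwidth = binwidth) +
  facet_grid(cyl ~ am) +
  theme_bw() +
  scale_y_continuous(labels = percent_format())
p
```

With a fixed `binwidth`, facets with few observations get taller bars rather than wider ones.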