I am trying to plot a nice stacked percent bar chart using ggplot2. I've read some material and have almost managed to plot what I want. I've listed that material here, since it might be useful to have it in one place:
How do I label a stacked bar chart in ggplot2 without creating a summary data frame?
Create stacked barplot where each stack is scaled to sum to 100%
R stacked percentage bar plot with percentage of binary factor and labels (with ggplot)
My problem is that I can't place the labels where I want them: in the middle of the bars.
You can see the problem in the picture above: the labels look awful and overlap each other.
What I am looking for right now is:
1. How do I place labels in the middle of the bars (areas)?
2. How do I plot only some of the labels, for example only those greater than 10%?
3. How do I solve the overlapping problem?
For Q1, @MikeWise suggested a possible solution. However, I still can't get it to work.
I've also enclosed a reproducible example showing how I plotted this graph.
library('plyr')
library('ggplot2')
library('scales')
set.seed(1992)
n=68
Category <- sample(c("Black", "Red", "Blue", "Cyna", "Purple"), n, replace = TRUE, prob = NULL)
Brand <- sample("Brand", n, replace = TRUE, prob = NULL)
Brand <- paste0(Brand, sample(1:5, n, replace = TRUE, prob = NULL))
USD <- abs(rnorm(n))*100
df <- data.frame(Category, Brand, USD)
# Calculate the percentages
df = ddply(df, .(Brand), transform, percent = USD/sum(USD) * 100)
# Format the labels and calculate their positions
df = ddply(df, .(Brand), transform, pos = (cumsum(USD) - 0.5 * USD))
# Create nice labels
df$label = paste0(sprintf("%.0f", df$percent), "%")
ggplot(df, aes(x=reorder(Brand, USD, function(x) +sum(x)), y=percent, fill=Category)) +
  geom_bar(position = "fill", stat='identity', width = .7) +
  geom_text(aes(label=label, ymax=100, ymin=0), vjust=0, hjust=0, color = "white", position=position_fill()) +
  coord_flip() +
  scale_y_continuous(labels = percent_format()) +
  ylab("") +
  xlab("")
3 solutions
#1
27
Here's how to center the labels and avoid plotting labels for small percentages. An additional issue in your data is that each bar has multiple sections of the same colour. It seems to me that all the sections of a given colour within a bar should be combined. The code below uses dplyr instead of plyr to set up the data for plotting:
library(dplyr)
# Initial data frame
df <- data.frame(Category, Brand, USD)
# Calculate percentages and label positions
df.summary = df %>% group_by(Brand, Category) %>%
  summarise(USD = sum(USD)) %>%   # Within each Brand, sum all values in each Category
  # summarise() drops the Category grouping, so mutate() below works within each Brand
  mutate(percent = USD/sum(USD),
         pos = cumsum(percent) - 0.5*percent)
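As a quick sanity check on the summary step above, here is a minimal sketch that rebuilds sample data similar to the question's (not guaranteed to match it value for value) and confirms that percent sums to 1 within each Brand and that every pos falls strictly inside the bar. The `.groups` argument is my addition and assumes dplyr 1.0.0 or later; `check` is just a hypothetical name for the result.

```r
library(dplyr)

# Sample data in the spirit of the question's example
set.seed(1992)
n <- 68
Category <- sample(c("Black", "Red", "Blue", "Cyna", "Purple"), n, replace = TRUE)
Brand <- paste0("Brand", sample(1:5, n, replace = TRUE))
USD <- abs(rnorm(n)) * 100
df <- data.frame(Category, Brand, USD)

# Same summary as above; .groups = "drop_last" keeps the Brand grouping explicitly
df.summary <- df %>%
  group_by(Brand, Category) %>%
  summarise(USD = sum(USD), .groups = "drop_last") %>%
  mutate(percent = USD / sum(USD),
         pos = cumsum(percent) - 0.5 * percent)

# One row per Brand: the shares should total 1 and all midpoints should lie in (0, 1)
check <- df.summary %>%
  summarise(total = sum(percent), min_pos = min(pos), max_pos = max(pos))
print(check)
```

If `total` is not 1 for some Brand, the grouping was probably dropped before the mutate() step.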
To plot the data, use an ifelse statement to determine whether a label is plotted. In this case, I've suppressed the labels for percentages below 7%.
ggplot(df.summary, aes(x=reorder(Brand, USD, function(x) +sum(x)), y=percent, fill=Category)) +
  geom_bar(stat='identity', width = .7, colour="black", lwd=0.1) +
  geom_text(aes(label=ifelse(percent >= 0.07, paste0(sprintf("%.0f", percent*100),"%"),""),
                y=pos), colour="white") +
  coord_flip() +
  scale_y_continuous(labels = percent_format()) +
  labs(y="", x="")
UPDATE: With ggplot2 version 2, it is no longer necessary to calculate the coordinates of the text labels to center them. Instead, you can use position=position_stack(vjust=0.5). For example:
ggplot(df.summary, aes(x=reorder(Brand, USD, sum), y=percent, fill=Category)) +
  geom_bar(stat="identity", width = .7, colour="black", lwd=0.1) +
  geom_text(aes(label=ifelse(percent >= 0.07, paste0(sprintf("%.0f", percent*100),"%"),"")),
            position=position_stack(vjust=0.5), colour="white") +
  coord_flip() +
  scale_y_continuous(labels = percent_format()) +
  labs(y="", x="")
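The same idea can be sketched with position_fill(), which also touches on the question's other two points (hiding small labels and overlap). This is only a sketch under a few assumptions: the sample data are rebuilt in a simplified way and may not match the question's values, `min_share` is a hypothetical name for the 10% cut-off mentioned in the question, and geom_col() and the vjust argument assume ggplot2 2.2.0 or later. One caveat with manually computed positions: as far as I know, ggplot2 2.2.0 reversed the default stacking order so that it matches the legend, so cumsum-based positions can land in the wrong segments unless the rows are sorted to match. Letting the position function compute the midpoints sidesteps that.

```r
library(dplyr)
library(ggplot2)
library(scales)

# Sample data in the spirit of the question's example
set.seed(1992)
n <- 68
Category <- sample(c("Black", "Red", "Blue", "Cyna", "Purple"), n, replace = TRUE)
Brand <- paste0("Brand", sample(1:5, n, replace = TRUE))
USD <- abs(rnorm(n)) * 100
df <- data.frame(Category, Brand, USD)

# One row per Brand/Category, plus each row's share of its Brand total
df.summary <- df %>%
  group_by(Brand, Category) %>%
  summarise(USD = sum(USD), .groups = "drop_last") %>%
  mutate(percent = USD / sum(USD)) %>%
  ungroup()

min_share <- 0.10  # hypothetical threshold: hide labels below 10%

p <- ggplot(df.summary, aes(x = reorder(Brand, USD, sum), y = USD, fill = Category)) +
  # position = "fill" rescales every bar to 100%, so raw USD can be used for y
  geom_col(position = "fill", width = 0.7, colour = "black") +
  # position_fill(vjust = 0.5) puts each label at the midpoint of its segment;
  # check_overlap = TRUE skips labels that would overlap ones already drawn
  geom_text(aes(label = ifelse(percent >= min_share,
                               paste0(sprintf("%.0f", percent * 100), "%"), "")),
            position = position_fill(vjust = 0.5),
            colour = "white", check_overlap = TRUE) +
  coord_flip() +
  scale_y_continuous(labels = percent_format()) +
  labs(x = "", y = "")

print(p)
```

With the threshold in place, overlap is rarely an issue for bars this wide, so check_overlap is mostly a safety net here.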
#2
1
I followed the example above and found a way to put nice labels on a simple stacked bar chart (absolute values rather than percentages). I think it might be useful too.
library(dplyr)
library(ggplot2)
df <- data.frame(Category, Brand, USD)
# Sum USD per Brand and Category, then calculate label positions
df.summary = df %>% group_by(Brand, Category) %>%
  summarise(USD = sum(USD)) %>%   # Within each Brand, sum all values in each Category
  mutate(pos = cumsum(USD) - 0.5*USD)
ggplot(df.summary, aes(x=reorder(Brand, USD, function(x) +sum(x)), y=USD, fill=Category)) +
  geom_bar(stat='identity', width = .7, colour="black", lwd=0.1) +
  # Only label segments larger than 100 USD
  geom_text(aes(label=ifelse(USD>100, round(USD,0), ""),
                y=pos), colour="white") +
  coord_flip() +
  labs(y="", x="")