From a data frame I want to plot a pie chart for five categories with their percentages as labels in the same graph in order from highest to lowest, going clockwise.
My code is:
League<-c("A","B","A","C","D","E","A","E","D","A","D")
data<-data.frame(League) # I have more variables
p<-ggplot(data,aes(x="",fill=League))
p<-p+geom_bar(width=1)
p<-p+coord_polar(theta="y")
p<-p+geom_text(data,aes(y=cumsum(sort(table(data)))-0.5*sort(table(data)),label=paste(as.character(round(sort(table(data))/sum(table(data)),2)),rep("%",5),sep="")))
p
I use
cumsum(sort(table(data)))-0.5*sort(table(data))
to place each label in the middle of its slice, and
label=paste(as.character(round(sort(table(data))/sum(table(data)),2)),rep("%",5),sep="")
for the labels, which are the percentages.
I get the following output:
Error: ggplot2 doesn't know how to deal with data of class uneval
3 solutions
#1
9
I've preserved most of your code. I found this pretty easy to debug by leaving out the coord_polar... it's easier to see what's going on as a bar graph.
The main thing was to reorder the factor from highest to lowest to get the plotting order correct, and then to adjust the label positions until they were right. I also simplified your code for the labels (you don't need the as.character or the rep, and paste0 is a shortcut for paste with sep = "").
library(ggplot2)
League<-c("A","B","A","C","D","E","A","E","D","A","D")
data<-data.frame(League) # I have more variables
# reorder the factor levels from most to least frequent
data$League <- reorder(data$League, X = data$League, FUN = function(x) -length(x))
# centers of the wedges, used as label positions
at <- nrow(data) - as.numeric(cumsum(sort(table(data)))-0.5*sort(table(data)))
label <- paste0(round(sort(table(data))/sum(table(data)),2) * 100,"%")
p <- ggplot(data,aes(x="", fill=League)) +
  geom_bar(width = 1) +
  coord_polar(theta="y") +
  annotate(geom = "text", y = at, x = 1, label = label)
p
The at calculation finds the centers of the wedges. (It's easier to think of them as the centers of bars in a stacked bar plot; just run the plot above without the coord_polar line to see.) The at calculation can be broken down as follows:
table(data) is the number of rows in each group, and sort(table(data)) puts them in the order they'll be plotted. Taking the cumsum() of that gives us the edges of each bar when stacked on top of each other, and multiplying by 0.5 gives us half the height of each bar in the stack (or half the width of each wedge of the pie).
as.numeric() simply ensures we have a numeric vector rather than an object of class table.
Subtracting the half-widths from the cumulative heights gives the center of each bar when stacked up. But ggplot will stack the bars with the biggest on the bottom, whereas all our sort()ing puts the smallest first, so we need to do nrow - everything, because what we've actually calculated are the label positions relative to the top of the bar, not the bottom. (And, with the original disaggregated data, nrow() is the total number of rows, hence the total height of the bar.)
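To make the breakdown above concrete, here is a minimal base-R sketch that evaluates each step on the sample League vector from the question. The intermediate names (counts, edges, half, centers) are introduced here for illustration only and do not appear in the answer's code.

```r
# Sample data from the question
League <- c("A","B","A","C","D","E","A","E","D","A","D")

counts  <- sort(table(League))        # group sizes, smallest first: 1 1 2 3 4
edges   <- cumsum(counts)             # cumulative heights of the stacked bars: 1 2 4 7 11
half    <- 0.5 * counts               # half the height of each bar
centers <- as.numeric(edges - half)   # bar centers counted from one end: 0.5 1.5 3 5.5 9
at      <- length(League) - centers   # the same centers counted from the other end

at
# values: 10.5, 9.5, 8, 5.5, 2
```

The largest group (A, 4 of 11 rows) ends up with its label at 2, the midpoint of the interval from 0 to 4. Which end of the stack that interval is drawn at may depend on the installed ggplot2 version, so it is worth checking the labels against the plain bar chart before adding coord_polar.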
#2
9
Preface: I did not make pie charts of my own free will.
Here's a modification of the ggpie function that includes percentages:
library(ggplot2)
library(dplyr)
#
# df$main should contain observations of interest
# df$condition can optionally be used to facet wrap
#
# labels should be a character vector of same length as group_by(df, main) or
# group_by(df, condition, main) if facet wrapping
#
pie_chart <- function(df, main, labels = NULL, condition = NULL) {
  # convert the data into percentages. group by conditional variable if needed
  df <- group_by_(df, .dots = c(condition, main)) %>%
    summarize(counts = n()) %>%
    mutate(perc = counts / sum(counts)) %>%
    arrange(desc(perc)) %>%
    mutate(label_pos = cumsum(perc) - perc / 2,
           perc_text = paste0(round(perc * 100), "%"))
  # reorder the category factor levels to order the legend
  df[[main]] <- factor(df[[main]], levels = unique(df[[main]]))
  # if labels haven't been specified, use what's already there
  if (is.null(labels)) labels <- as.character(df[[main]])
  p <- ggplot(data = df, aes_string(x = factor(1), y = "perc", fill = main)) +
    # make stacked bar chart with black border
    geom_bar(stat = "identity", color = "black", width = 1) +
    # add the percents to the interior of the chart
    geom_text(aes(x = 1.25, y = label_pos, label = perc_text), size = 4) +
    # add the category labels to the chart
    # increase x / play with label strings if labels aren't pretty
    geom_text(aes(x = 1.82, y = label_pos, label = labels), size = 4) +
    # convert to polar coordinates
    coord_polar(theta = "y") +
    # formatting
    scale_y_continuous(breaks = NULL) +
    scale_fill_discrete(name = "", labels = unique(labels)) +
    theme(text = element_text(size = 22),
          axis.ticks = element_blank(),
          axis.text = element_blank(),
          axis.title = element_blank())
  # facet wrap if that's happening
  if (!is.null(condition)) p <- p + facet_wrap(condition)
  return(p)
}
Example:
# sample data
resps <- c("A", "A", "A", "F", "C", "C", "D", "D", "E")
cond <- c(rep("cat A", 5), rep("cat B", 4))
example <- data.frame(resps, cond)
Just like a typical ggplot call:
ex_labs <- c("alpha", "charlie", "delta", "echo", "foxtrot")
pie_chart(example, main = "resps", labels = ex_labs) +
labs(title = "unfacetted example")
ex_labs2 <- c("alpha", "charlie", "foxtrot", "delta", "charlie", "echo")
pie_chart(example, main = "resps", labels = ex_labs2, condition = "cond") +
labs(title = "facetted example")
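The percentage and label-position arithmetic inside pie_chart can be checked without plotting. The following is a rough base-R sketch of the same steps for the unfacetted case only; pie_summary is a hypothetical helper name used for illustration, not part of the answer or of any package.

```r
# Hypothetical helper: reproduces the summary columns that pie_chart builds
# (counts, perc, label_pos, perc_text) for a single grouping column.
pie_summary <- function(df, main) {
  counts <- table(df[[main]])
  counts <- counts[counts > 0]                      # drop unused factor levels
  out <- data.frame(
    category = names(counts),
    counts = as.numeric(counts),
    stringsAsFactors = FALSE
  )
  out$perc <- out$counts / sum(out$counts)          # share of each category
  out <- out[order(-out$perc), ]                    # largest share first
  out$label_pos <- cumsum(out$perc) - out$perc / 2  # midpoint of each slice
  out$perc_text <- paste0(round(out$perc * 100), "%")
  rownames(out) <- NULL
  out
}

# Same sample responses as in the example above
resps <- c("A", "A", "A", "F", "C", "C", "D", "D", "E")
s <- pie_summary(data.frame(resps), "resps")
s
```

With this sample the rows come out in the order A, C, D, E, F, which matches the order of the ex_labs vector passed to pie_chart in the unfacetted example.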
#3
0
This worked in every case I tried. The function below is heavily inspired by the one found here.
library(ggplot2)
ggpie <- function (data)
{
  # prepare name
  deparse( substitute(data) ) -> name ;
  # prepare percents for legend
  table( factor(data) ) -> tmp.count1
  prop.table( tmp.count1 ) * 100 -> tmp.percent1 ;
  paste( tmp.percent1, " %", sep = "" ) -> tmp.percent2 ;
  as.vector(tmp.count1) -> tmp.count1 ;
  # find breaks for legend
  rev( tmp.count1 ) -> tmp.count2 ;
  rev( cumsum( tmp.count2 ) - (tmp.count2 / 2) ) -> tmp.breaks1 ;
  # prepare data
  data.frame( vector1 = tmp.count1, names1 = names(tmp.percent1) ) -> tmp.df1 ;
  # plot data
  tmp.graph1 <- ggplot(tmp.df1, aes(x = 1, y = vector1, fill = names1 ) ) +
    geom_bar(stat = "identity", color = "black" ) +
    guides( fill = guide_legend(override.aes = list( colour = NA ) ) ) +
    coord_polar( theta = "y" ) +
    theme(axis.ticks = element_blank(),
          axis.text.y = element_blank(),
          axis.text.x = element_text( colour = "black"),
          axis.title = element_blank(),
          plot.title = element_text( hjust = 0.5, vjust = 0.5) ) +
    scale_y_continuous( breaks = tmp.breaks1, labels = tmp.percent2 ) +
    ggtitle( name ) +
    scale_fill_grey( name = "") ;
  return( tmp.graph1 )
} ;
An example:
sample( LETTERS[1:6], 200, replace = TRUE) -> vector1 ;
ggpie(vector1)