如何将一个带有多个变量的条形图并排分组

时间:2021-09-04 12:05:36

I have a dataset which looks like this one below. I am trying to make a barplot with the grouping variable gender, with all the variables side by side on the x axis (grouped by gender as filler with different colors), and mean values of variables on the y axis (which basically represents percentages)

我有一个数据集,如下所示。我试图制作一个带有分组变量性别的条形图,所有变量在x轴上并排(按性别分组为不同颜色的填充),y轴上的变量平均值(基本上代表百分比)

tea                coke            beer             water           gender
14.55              26.50793651     22.53968254      40              1
24.92997199        24.50980392     26.05042017      24.50980393     2
23.03732304        30.63063063     25.41827542      20.91377091     1   
225.51781276       24.6064623      24.85501243      50.80645161     1
24.53662842        26.03706973     25.24271845      24.18358341     2   

In the end I want to get a barplot like this 如何将一个带有多个变量的条形图并排分组

最后我想得到一个像这样的条形图

any suggestions how to do that? I made some searches but I only find examples for factors on the x axis, not variables grouped by a factor. any help will be appreciated!

任何建议如何做到这一点?我做了一些搜索,但我只找到x轴上的因子的例子,而不是按因子分组的变量。任何帮助将不胜感激!

3 个解决方案

#1


33  

You can use aggregate to calculate the means:

您可以使用聚合来计算均值:

means<-aggregate(df,by=list(df$gender),mean)
Group.1      tea     coke     beer    water gender
1       1 87.70171 27.24834 24.27099 37.24007      1
2       2 24.73330 25.27344 25.64657 24.34669      2

Get rid of the Group.1 column

摆脱Group.1列

means<-means[,2:length(means)]

Then you have reformat the data to be in long format:

然后,您将数据重新格式化为长格式:

library(reshape2)
means.long<-melt(means,id.vars="gender")
  gender variable    value
1      1      tea 87.70171
2      2      tea 24.73330
3      1     coke 27.24834
4      2     coke 25.27344
5      1     beer 24.27099
6      2     beer 25.64657
7      1    water 37.24007
8      2    water 24.34669

Finally, you can use ggplot2 to create your plot:

最后,您可以使用ggplot2创建您的情节:

library(ggplot2)
ggplot(means.long,aes(x=variable,y=value,fill=factor(gender)))+
  geom_bar(stat="identity",position="dodge")+
  scale_fill_discrete(name="Gender",
                      breaks=c(1, 2),
                      labels=c("Male", "Female"))+
  xlab("Beverage")+ylab("Mean Percentage")

如何将一个带有多个变量的条形图并排分组

#2


8  

You can plot the means without resorting to external calculations and additional tables using stat_summary(...). In fact, stat_summary(...) was designed for exactly what you are doing.

您可以使用stat_summary(...)绘制平均值而无需借助外部计算和其他表格。事实上,stat_summary(...)的设计正是为了你正在做的事情。

library(ggplot2)
library(reshape2)            # for melt(...)
gg <- melt(df,id="gender")   # df is your original table
ggplot(gg, aes(x=variable, y=value, fill=factor(gender))) + 
  stat_summary(fun.y=mean, geom="bar",position=position_dodge(1)) + 
  scale_color_discrete("Gender")
  stat_summary(fun.ymin=min,fun.ymax=max,geom="errorbar",
               color="grey80",position=position_dodge(1), width=.2)

如何将一个带有多个变量的条形图并排分组

To add "error bars" you cna also use stat_summary(...) (here, I'm using the min and max value rather than sd because you have so little data).

要添加“错误栏”,你还可以使用stat_summary(...)(这里,我使用的是最小值和最大值而不是sd,因为你的数据很少)。

ggplot(gg, aes(x=variable, y=value, fill=factor(gender))) + 
  stat_summary(fun.y=mean, geom="bar",position=position_dodge(1)) + 
  stat_summary(fun.ymin=min,fun.ymax=max,geom="errorbar",
               color="grey40",position=position_dodge(1), width=.2) +
  scale_fill_discrete("Gender")

如何将一个带有多个变量的条形图并排分组

#3


3  

Using reshape2 and dplyr. Your data:

使用reshape2和dplyr。你的数据:

df <- read.table(text=
"tea                coke            beer             water           gender
14.55              26.50793651     22.53968254      40              1
24.92997199        24.50980392     26.05042017      24.50980393     2
23.03732304        30.63063063     25.41827542      20.91377091     1   
225.51781276       24.6064623      24.85501243      50.80645161     1
24.53662842        26.03706973     25.24271845      24.18358341     2", header=TRUE)

Getting data into correct form:

将数据转换为正确的形式:

library(reshape2)
library(dplyr)
df.melt <- melt(df, id="gender")
bar <- group_by(df.melt, variable, gender)%.%summarise(mean=mean(value))

Plotting:

绘图:

library(ggplot2)
ggplot(bar, aes(x=variable, y=mean, fill=factor(gender)))+
  geom_bar(position="dodge", stat="identity")

如何将一个带有多个变量的条形图并排分组

#1


33  

You can use aggregate to calculate the means:

您可以使用聚合来计算均值:

means<-aggregate(df,by=list(df$gender),mean)
Group.1      tea     coke     beer    water gender
1       1 87.70171 27.24834 24.27099 37.24007      1
2       2 24.73330 25.27344 25.64657 24.34669      2

Get rid of the Group.1 column

摆脱Group.1列

means<-means[,2:length(means)]

Then you have reformat the data to be in long format:

然后,您将数据重新格式化为长格式:

library(reshape2)
means.long<-melt(means,id.vars="gender")
  gender variable    value
1      1      tea 87.70171
2      2      tea 24.73330
3      1     coke 27.24834
4      2     coke 25.27344
5      1     beer 24.27099
6      2     beer 25.64657
7      1    water 37.24007
8      2    water 24.34669

Finally, you can use ggplot2 to create your plot:

最后,您可以使用ggplot2创建您的情节:

library(ggplot2)
ggplot(means.long,aes(x=variable,y=value,fill=factor(gender)))+
  geom_bar(stat="identity",position="dodge")+
  scale_fill_discrete(name="Gender",
                      breaks=c(1, 2),
                      labels=c("Male", "Female"))+
  xlab("Beverage")+ylab("Mean Percentage")

如何将一个带有多个变量的条形图并排分组

#2


8  

You can plot the means without resorting to external calculations and additional tables using stat_summary(...). In fact, stat_summary(...) was designed for exactly what you are doing.

您可以使用stat_summary(...)绘制平均值而无需借助外部计算和其他表格。事实上,stat_summary(...)的设计正是为了你正在做的事情。

library(ggplot2)
library(reshape2)            # for melt(...)
gg <- melt(df,id="gender")   # df is your original table
ggplot(gg, aes(x=variable, y=value, fill=factor(gender))) + 
  stat_summary(fun.y=mean, geom="bar",position=position_dodge(1)) + 
  scale_color_discrete("Gender")
  stat_summary(fun.ymin=min,fun.ymax=max,geom="errorbar",
               color="grey80",position=position_dodge(1), width=.2)

如何将一个带有多个变量的条形图并排分组

To add "error bars" you cna also use stat_summary(...) (here, I'm using the min and max value rather than sd because you have so little data).

要添加“错误栏”,你还可以使用stat_summary(...)(这里,我使用的是最小值和最大值而不是sd,因为你的数据很少)。

ggplot(gg, aes(x=variable, y=value, fill=factor(gender))) + 
  stat_summary(fun.y=mean, geom="bar",position=position_dodge(1)) + 
  stat_summary(fun.ymin=min,fun.ymax=max,geom="errorbar",
               color="grey40",position=position_dodge(1), width=.2) +
  scale_fill_discrete("Gender")

如何将一个带有多个变量的条形图并排分组

#3


3  

Using reshape2 and dplyr. Your data:

使用reshape2和dplyr。你的数据:

df <- read.table(text=
"tea                coke            beer             water           gender
14.55              26.50793651     22.53968254      40              1
24.92997199        24.50980392     26.05042017      24.50980393     2
23.03732304        30.63063063     25.41827542      20.91377091     1   
225.51781276       24.6064623      24.85501243      50.80645161     1
24.53662842        26.03706973     25.24271845      24.18358341     2", header=TRUE)

Getting data into correct form:

将数据转换为正确的形式:

library(reshape2)
library(dplyr)
df.melt <- melt(df, id="gender")
bar <- group_by(df.melt, variable, gender)%.%summarise(mean=mean(value))

Plotting:

绘图:

library(ggplot2)
ggplot(bar, aes(x=variable, y=mean, fill=factor(gender)))+
  geom_bar(position="dodge", stat="identity")

如何将一个带有多个变量的条形图并排分组