I am trying to create this type of chart from the data on the left (arbitrary values for simplicity):
The goal is to plot variable X on the x-axis with the mean on the Y-axis and error bars equal to the standard error se.
The problem is that the values 1-10 should each be represented individually (blue curve), and the values for A and B should be plotted at each of the 1-10 values (green and red lines).
I can draw the curve if I save the data by hand and manually copy the values for A and B to each value of X, but this is not very time-efficient. Is there a more elegant way to do this?
Thanks in advance!
EDIT: As suggested, here is the code:
df <- structure(list(X = structure(c(1L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 2L, 11L, 12L), .Label = c("1", "10", "2", "3", "4", "5",
"6", "7", "8", "9", "A", "B"), class = "factor"), mean = c(1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 5.5, 6.5), sd = c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), se = c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("X", "mean", "sd", "se"), class = "data.frame", row.names = c(NA,-12L))
df <- as.data.frame(df)
df$X <- factor(df$X)
library(ggplot2)
# refer to columns by bare name inside aes() rather than df$X
p <- ggplot(df, aes(x = X, y = mean)) + geom_point() + geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = .1)
p
3 Answers
#1
I'm afraid I don't know ggplot, but hopefully this is what you want (it might also help others understand your question).
You want a ggplot with three lines plus error bars: 1. df$X against df$mean; 2. df$X against df$row_A_mean; 3. df$X against df$row_B_mean; 4. error bars from the se column.
df <- structure(list(X = structure(c(1L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 2L, 11L, 12L), .Label = c("1", "10", "2", "3", "4", "5",
"6", "7", "8", "9", "A", "B"), class = "factor"), mean = c(1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 5.5, 6.5), sd = c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), se = c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("X", "mean", "sd", "se"), class = "data.frame", row.names = c(NA,-12L))
df <- as.data.frame(df)
df$X <- factor(df$X)
library(ggplot2)
# refer to columns by bare name inside aes() rather than df$X
p <- ggplot(df, aes(x = X, y = mean)) + geom_point() + geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = .1)
p
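# row A mean, repeated for every row (a horizontal line would also work, unless the mean changes)
df$row_A_mean <- rep(df[11, ]$mean, nrow(df))
# row A sd
df$row_A_sd <- rep(df[11, ]$sd, nrow(df))
# keep only the numeric X values; convert via as.character() because
# as.numeric() on a factor returns the level codes, not the labels
num <- df[1:10, ]
num$X <- as.numeric(as.character(num$X))
plot(num$X, num$mean, type = "p", col = "red")
lines(num$X, num$row_A_mean, col = "green")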
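A side note on as.numeric() with factors, since df$X is one: called directly, it returns the internal level codes rather than the labels, and the levels here sort alphabetically ("1", "10", "2", ...). Below is a minimal sketch of the difference, using a small hypothetical factor f instead of the full df:

```r
# Hypothetical factor with the same alphabetical level order as df$X
f <- factor(c("1", "2", "10"), levels = c("1", "10", "2"))

codes  <- as.numeric(f)                # level codes: 1 3 2
values <- as.numeric(as.character(f))  # labels read as numbers: 1 2 10

print(codes)
print(values)
```

So for plotting on a numeric axis, going through as.character() first is the safer route.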
#2
If we use subsets to define the data elements of the ggplot, we can come up with one solution using geom_hline:
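library(ggplot2)
theme_set(theme_bw())
# put the levels in numeric order; the default order is alphabetical ("1", "10", "2", ...)
df$X <- factor(df$X, levels = c(1:10, "A", "B"))
ggplot(data = df[1:10, ]) +
  geom_errorbar(aes(x = X, ymin = mean - se, ymax = mean + se)) +
  geom_point(aes(x = X, y = mean)) +
  geom_line(aes(x = X, y = mean), group = 1) +
  geom_hline(data = df[11, ], aes(yintercept = mean, colour = 'A')) +
  geom_hline(data = df[12, ], aes(yintercept = mean, colour = 'B'))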
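If you also want the A and B values (with their error bars) drawn at every one of the 1-10 positions, one option is to let merge() do the copying. This is only a sketch under some assumptions: df_demo is a stand-in for your df, and pos, series, ref_rep and long are names I made up for illustration.

```r
# Stand-in for df with the columns used here (X, mean, se)
df_demo <- data.frame(
  X    = c(as.character(1:10), "A", "B"),
  mean = c(1:10, 5.5, 6.5),
  se   = 1,
  stringsAsFactors = FALSE
)

# Main series: rows 1-10, with a numeric x position
main <- df_demo[!df_demo$X %in% c("A", "B"), ]
main$pos <- as.numeric(main$X)
main$series <- "X"

# Reference rows A and B; their label becomes the series name
ref <- df_demo[df_demo$X %in% c("A", "B"), ]
names(ref)[names(ref) == "X"] <- "series"

# merge() on data frames with no shared column names returns the
# Cartesian product, so each reference row is paired with every position
ref_rep <- merge(data.frame(pos = main$pos), ref)

cols <- c("pos", "series", "mean", "se")
long <- rbind(main[, cols], ref_rep[, cols])

# Build the plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = pos, y = mean, colour = series)) +
    geom_line() +
    geom_point() +
    geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 0.1) +
    scale_x_continuous(breaks = 1:10)
}
```

Calling print(p) afterwards draws the chart. Compared with geom_hline, this keeps A and B as ordinary series, so they get points and error bars like the main curve.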
#3
It helps to reshape your data into long form so that you can make full use of ggplot's aesthetic mappings. Generally I would use reshape2::melt for this, but your data as currently formatted doesn't really lend itself to it. I'll show you what I mean by long form so you can see what we're aiming for:
#setting variables for your classes so it's a bit more scalable - reset as applicable
x.seriesLength <- 10
x.class.name <- "X" #name of the main series class; X in your example
a.vec <- c(5.5, 1, 1, "A")
b.vec <- c(6.5, 1, 1, "B")
#trimming df so we can reshape
df <- df[1:x.seriesLength, 2:4]
df$class <- x.class.name #adding class column
#converting your static A and B values to long form, sending to a data.frame and adding to df
add <- matrix(c(rep(a.vec, times = x.seriesLength),
                rep(b.vec, times = x.seriesLength)),
              byrow = TRUE,
              ncol = 4)
colnames(add) <- c("mean", "sd", "se", "class")
df <- rbind(df, add)
print(df)
Then we need to do a bit more cleaning:
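df$rownum <- rep(1:x.seriesLength, times = 3)
df[, 1:3] <- sapply(df[, 1:3], as.numeric) # cast back to numeric after rbind with the character matrix
# error bars span one standard error (se) either side of the mean
df$barmin <- df$mean - df$se
df$barmax <- df$mean + df$se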
Now we have a long-form data frame with the required data. We can then use the new class column to plot and color multiple series.
#use class column to tell ggplot which points belong to which series
g <- ggplot(data = df) +
  geom_point(aes(x = rownum, y = mean, color = class)) +
  geom_errorbar(aes(x = rownum, ymin = barmin, ymax = barmax, color = class), width = .1)
g
Edit: If you want lines instead of points, just replace geom_point with geom_line.
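One detail worth spelling out: c(5.5, 1, 1, "A") is coerced to a character vector, which is why the numeric columns have to be cast back after the rbind. A possible alternative, sketched here with a stand-in data frame (main, add and long are illustrative names, not part of the code above), is to build the extra rows as a data frame so the numeric columns stay numeric:

```r
x.seriesLength <- 10

# Stand-in for the trimmed df (mean, sd, se, class)
main <- data.frame(mean = as.numeric(1:10), sd = 1, se = 1, class = "X",
                   stringsAsFactors = FALSE)

# Mixing numbers and a string in c() gives a character vector
mixed <- c(5.5, 1, 1, "A")
print(is.character(mixed))

# A data frame keeps each column's own type, so no as.numeric() cast is needed
add <- data.frame(
  mean  = rep(c(5.5, 6.5), each = x.seriesLength),
  sd    = 1,
  se    = 1,
  class = rep(c("A", "B"), each = x.seriesLength),
  stringsAsFactors = FALSE
)

long <- rbind(main, add)
long$rownum <- rep(seq_len(x.seriesLength), times = 3)
str(long)
```

The resulting long has the same shape as the df built above (30 rows, one class per series), so the same ggplot call should apply with data = long.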