Controlling column widths in a ggplot2 graph with a series and inconsistent data

Time: 2022-06-15 15:02:15

In the artificial data I created for the MWE below, I have tried to capture the essence of a script I wrote in R. As the graph produced by this code shows, one of my conditions has no "No" value to complete the series.

I have been told that unless I can make this last column, which lacks the extra series, as thin as the columns elsewhere in the graph, I won't be permitted to use these graphs. This is a real problem because the script I have written produces hundreds of graphs simultaneously, complete with stats, significance indicators, propagated error bars, and intelligent y-axis adjustments (these features are of course not present in the MWE).

A few other comments:

  • This exception column is not guaranteed to be at the end of the graph, so manually tweaking the plot to change the series color and invert the order, leaving the extra space on the right-hand side, isn't reliable.

  • I have tried to simulate the data as a constant 0 so that the series "is present" but invisible, but, as would be expected, the order of the series c(No,Yes) makes this skip a space, which is also unacceptable. This is how the same question was answered in these questions, but sadly it doesn't work for me with my restrictions: "Consistent width for geom_bar in the event of missing data" and "Include space for missing factor level used in fill aesthetics in geom_boxplot".

  • I also tried to do this with facets, but numerous issues arose there, including line breaks and errors in the annotations I add to the x-axis.

MWE:


library(ggplot2)

print("Program started")

x <- c("1","2","3","1","2","3","4")
s <- c("No","No","No","Yes","Yes","Yes","Yes")
y <- c(1,2,3,2,3,4,5)
df <- as.data.frame(cbind(x,s,y))

print(df)

gg <- ggplot(data = df, aes_string(x="x", y="y", weight="y", ymin=paste0("y"), ymax=paste0("y"), fill="s"))
# position_dodge() has no height argument in current ggplot2 releases
dodge_str <- position_dodge(width = NULL)
gg <- gg + geom_bar(position=dodge_str, stat="identity", size=.3, colour = "black")

print(gg)

print("Program complete - a graph should be visible.")

2 Solutions

#1



If you are willing to compute the x coordinates of the bars yourself, as shown below, you can get a chart that may be close to what you are looking for.

library(ggplot2)

x <- c("1","2","3","1","2","3","4")
s <- c("No","No","No","Yes","Yes","Yes","Yes")
y <- c(1,2,3,2,3,4,5)
# Build the data frame directly; cbind() would coerce y to character
df <- data.frame(x, s, y)
# Give every bar its own integer slot, ordered by x and then by series
df$x_pos[order(df$x, df$s)] <- 1:nrow(df)
# Count the bars in each x group and find the center of each group
x_stats <- as.data.frame.table(table(df$x), responseName="x_counts")
x_stats$center <- as.vector(tapply(df$x_pos, df$x, mean))
df <- merge(df, x_stats, by.x="x", by.y="Var1", all=TRUE)
bar_width <- .7
# A lone bar stays on its slot; paired bars are shifted until they touch
df$pos <- ifelse(df$x_counts == 1, df$x_pos,
                 ifelse(df$s == "No", df$x_pos + .5 - bar_width/2,
                                      df$x_pos - .5 + bar_width/2))
print(df)
gg <- ggplot(data=df, aes(x=pos, y=y, fill=s))
gg <- gg + geom_bar(position="identity", stat="identity", size=.3, colour="black", width=bar_width)
# Label each group once, at its center
gg <- gg + scale_x_continuous(breaks=x_stats$center, labels=as.character(x_stats$Var1))
plot(gg)

----- edit --------------------------------------------------

Modified to place the labels at the center of each group of bars.

Gives the following chart:

[Figure: resulting bar chart]

#2



Yeah, I figured out what happened: you need to be extra careful about factors being factors and numerics being numerics. In my case, with stringsAsFactors = FALSE, I have

str(df)
'data.frame':   7 obs. of  3 variables:
 $ x: chr  "1" "2" "3" "1" ...
 $ s: chr  "No" "No" "No" "Yes" ...
 $ y: chr  "1" "2" "3" "2" ...

dput(df)
structure(list(x = c("1", "2", "3", "1", "2", "3", "4"), s = c("No", 
"No", "No", "Yes", "Yes", "Yes", "Yes"), y = c("1", "2", "3", 
"2", "3", "4", "5")), .Names = c("x", "s", "y"), row.names = c(NA, 
-7L), class = "data.frame")

with no factors, and the numeric column turned into character because of the cbind-ing (sic!). Let us build another data frame:

dff <- data.frame(x = factor(df$x), s = factor(df$s), y = as.numeric(df$y))
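
The coercion mentioned above is easy to reproduce. Here is a minimal sketch in base R (the names m, df_cbind and df_direct are made up for this illustration): cbind() on a character vector and a numeric vector returns a character matrix, so every column of the resulting data frame ends up as text, whereas passing the vectors to data.frame() directly keeps y numeric.

```r
x <- c("1", "2", "3")
y <- c(1, 2, 3)

# cbind() must return a matrix, and a matrix holds a single type,
# so the numeric y is coerced to character
m <- cbind(x, y)

# Illustrative names only: one data frame built via cbind(), one built directly
df_cbind  <- as.data.frame(m, stringsAsFactors = FALSE)
df_direct <- data.frame(x, y, stringsAsFactors = FALSE)

class(df_cbind$y)   # character
class(df_direct$y)  # numeric
```

With y stored as text, ggplot2 would treat the y axis as discrete, which is why the conversion with as.numeric() matters here.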

Adding a "dummy" row (done manually for your example; check out the expand.grid version in the linked question for how to do this automatically):

# Use a one-row data frame so that y stays numeric; rbind() with c(4, "No", NA) would append a character vector
dff <- rbind(dff, data.frame(x = "4", s = "No", y = NA))
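
For a script that produces many graphs, the dummy rows can be generated instead of typed by hand. The following is a sketch of one way to do that with expand.grid() and merge(); it is not the code from the linked question, and the names all_combos and dff_full are assumptions made for this example.

```r
dff <- data.frame(
  x = factor(c("1", "2", "3", "1", "2", "3", "4")),
  s = factor(c("No", "No", "No", "Yes", "Yes", "Yes", "Yes")),
  y = c(1, 2, 3, 2, 3, 4, 5)
)

# Every x/s combination that should get a slot in the plot
all_combos <- expand.grid(x = levels(dff$x), s = levels(dff$s))

# Left join: combinations that are absent from dff come back with y = NA
dff_full <- merge(all_combos, dff, by = c("x", "s"), all.x = TRUE)

# The rows that were filled in
dff_full[is.na(dff_full$y), ]
```

Plotting dff_full instead of dff then reserves a slot for each missing combination, exactly as the hand-written dummy row does.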

Plotting (I removed extra aes):

# dff is the data frame built above; position_dodge() stands in for dodge_str from the question
ggplot(data = dff, aes(x, y, fill=s)) + 
  geom_bar(position=position_dodge(), stat="identity", size=.3, colour="black")
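
As an alternative to the dummy row, newer ggplot2 releases let the dodge position itself keep bar widths consistent. This is a different technique from the one in this answer, and the sketch below assumes an installed ggplot2 version in which position_dodge2() and its preserve argument are available. As far as I can tell, it also centers a lone bar on its tick, which avoids the skipped space the question objects to.

```r
library(ggplot2)

dff <- data.frame(
  x = factor(c("1", "2", "3", "1", "2", "3", "4")),
  s = factor(c("No", "No", "No", "Yes", "Yes", "Yes", "Yes")),
  y = c(1, 2, 3, 2, 3, 4, 5)
)

# preserve = "single" keeps each bar as wide as one bar in a full group,
# so the lone "Yes" bar at x = 4 is not stretched and no dummy row is needed
gg <- ggplot(dff, aes(x, y, fill = s)) +
  geom_col(position = position_dodge2(preserve = "single"), colour = "black")

print(gg)
```

Because this only changes the position argument, it should slot into a script that generates many graphs without touching the data preparation.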

