How to get ggplot2 to keep unused levels on a data subset

Date: 2021-11-29 14:54:41

My problem is clearly not new, but I haven't been able to find an answer to my exact coding question. I am working from a subset of my data (available here) and have tried every combination of scale_x_discrete(drop=FALSE) and scale_fill_discrete(drop=FALSE) to get ggplot2 to leave a space where the bar for Chipmunks would be (n=0 for event "CF"; n.b. the event corresponds to the variable "forage" in the data).

The code I am using is as follows:

library(ggplot2)
library(ggthemes)

# excluding MICROs from my plot
ggplot(data[data$sps == "MAMO" | data$sps == "TAST" | data$sps == "MUVI" |
            data$sps == "MUXX" | data$sps == "TAHU", ],
       aes(sps, fill = forage)) +
    geom_bar(position = "dodge") +
    labs(x = "Species", y = "Number of observations") +
    scale_x_discrete(labels = c("Marmot", "American Mink", "Weasel Spp.", "Red squirrel", "Chipmunk")) +
    theme_classic() +
    scale_fill_manual(values = c("#000000", "#666666", "#999999", "#CCCCCC"), name = "Event")

I then get a plot like this one: [image]

When I add scale_x_discrete(drop = FALSE) I get this: [image]

What the code appears to be doing is including my previously excluded MICRO data (hence everything after Marmots is shifted over by one, and Chipmunks still has only 3 bars).

When I try scale_fill_discrete(drop = FALSE), the resulting plot is identical to the first plot above. When I use both scale_x_discrete(drop = FALSE) and scale_fill_discrete(drop = FALSE), the plot looks like the second plot above.
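
A minimal sketch of why scale_x_discrete(drop = FALSE) brings MICRO back, assuming sps is stored as a factor (the toy vector below is hypothetical, not the asker's data). Subsetting rows removes the MICRO observations but not the MICRO level, and drop = FALSE tells the scale to show every level, used or not:

```r
# Hypothetical toy factor standing in for data$sps.
sps <- factor(c("MAMO", "MICRO", "TAST", "TAST"))

# Subsetting keeps the full level set, so "MICRO" is still a level
# even though no "MICRO" values remain.
kept <- sps[sps != "MICRO"]
levels(kept)     # "MAMO" "MICRO" "TAST"
table(kept)      # MICRO is listed with a count of 0

# droplevels() discards levels that no longer occur in the data.
trimmed <- droplevels(kept)
levels(trimmed)  # "MAMO" "TAST"
```

Calling droplevels() on the subset before plotting would remove the MICRO slot, but on its own it would not create the missing bar for a species/event pair that has no rows.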

I figure I could manually build a small table with the frequencies for each level (Event), but I would first like to try to code it properly in R.
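
For reference, such a frequency table does not have to be built by hand. The sketch below uses base R's table() on hypothetical toy observations (the values are made up; only the column names sps and forage come from the question). table() cross-tabulates every species/event pair, including pairs that never occur:

```r
# Hypothetical toy observations; TAST has no "CF" events.
obs <- data.frame(
  sps    = c("MAMO", "MAMO", "TAST", "TAST", "TAST"),
  forage = c("CF",   "F",    "F",    "F",    "T"),
  stringsAsFactors = FALSE
)

# Cross-tabulate, then convert to long format with a count column named n.
counts <- as.data.frame(table(sps = obs$sps, forage = obs$forage),
                        responseName = "n", stringsAsFactors = FALSE)
counts
```

The resulting data frame has one row per sps/forage combination, with n = 0 for the TAST/CF pair, so it could be plotted with geom_col() instead of geom_bar().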

Does anyone have suggestions for what I could add or change in my code to do this?

Update: I tried the code suggested below:

df1 %>% 
  filter(sps != "MICRO") %>% 
  group_by(sps) %>% 
  count(forage) %>% 
  ungroup %>% 
  complete(sps, forage, fill = list(n = 0)) %>% 
  ggplot(aes(sps, n)) + geom_col(aes(fill = forage), position = "dodge") +
  scale_x_discrete(labels=c("Marmot","American Mink", "Weasel Spp.", "Red squirrel", "Chipmunk")) + 
  theme_classic() + 
  scale_fill_manual(values=c("#000000", "#666666", "#999999","#CCCCCC"), name = "Event") + 
  labs(x = "Species", y = "Number of observations")

The resulting plot has the space (yay!) but still has an empty slot where MICRO would be: [image]

1 solution

#1


2  

The issue here is that no zero count is generated for sps = TAST, forage = CF. You can create that count with tidyr::complete. I've also added some dplyr functions to make the code cleaner. This assumes that your data frame is named df1 (rather than data, which is also the name of a built-in R function, so not a good choice):

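UPDATED: with stringsAsFactors = FALSE to address the issue raised in the comments. When sps is read in as a factor, complete() expands over all of its levels, including the filtered-out MICRO; as character, it expands only over the species still present.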

library(dplyr)
library(tidyr)
library(ggplot2)

# read sps and forage as character rather than factor
df1 <- read.table("data.txt", header = TRUE, stringsAsFactors = FALSE)
df1 %>% 
  # exclude MICRO from the plot
  filter(sps != "MICRO") %>% 
  # count observations per species and event
  group_by(sps) %>% 
  count(forage) %>% 
  ungroup() %>% 
  # add the missing species/event combinations with n = 0
  complete(sps, forage, fill = list(n = 0)) %>% 
  ggplot(aes(sps, n)) + geom_col(aes(fill = forage), position = "dodge") +
    scale_x_discrete(labels = c("Marmot", "American Mink", "Weasel Spp.", "Red squirrel", "Chipmunk")) + 
    theme_classic() + 
    scale_fill_manual(values = c("#000000", "#666666", "#999999", "#CCCCCC"), name = "Event") + 
    labs(x = "Species", y = "Number of observations")

Result: [image]
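
To illustrate the effect of stringsAsFactors on complete(), here is a small sketch on hypothetical toy data (not the asker's data.txt). It assumes dplyr and tidyr are installed, and compares the same counts with sps as a factor and as character:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; sps is a factor that carries a "MICRO" level.
toy <- data.frame(
  sps    = factor(c("MAMO", "MAMO", "TAST", "MICRO")),
  forage = c("CF", "F", "F", "F"),
  stringsAsFactors = FALSE
)

counts <- toy %>%
  filter(sps != "MICRO") %>%
  count(sps, forage)

# sps as factor: complete() expands over every level, so zero-count
# rows for the filtered-out "MICRO" level reappear.
as_factor <- counts %>%
  complete(sps, forage, fill = list(n = 0))

# sps as character: complete() expands only over values still present.
as_character <- counts %>%
  mutate(sps = as.character(sps)) %>%
  complete(sps, forage, fill = list(n = 0))

unique(as.character(as_factor$sps))
unique(as_character$sps)
```

In the factor version MICRO is back among the species, which matches the empty slot reported in the question's update; in the character version only MAMO and TAST remain, each with a row for every event.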
