R: Plotting a time series with quantiles using ggplot2

Time: 2022-07-22 14:57:15

I need to plot a time series with ggplot2. For each point of the time series I also have some quantiles, say 0.05, 0.25, 0.5, 0.75 and 0.95, i.e. I have five values for each point. For example:

time           quantile=0.05  quantile=0.25 quantile=0.5  quantile=0.75   quantile=0.95
00:01          623.0725       630.4353      903.8870       959.1407       1327.721
00:02          623.0944       631.3707      911.9967      1337.4564       1518.539
00:03          623.0725       630.4353      903.8870      1170.8316       1431.893
00:04          623.0725       630.4353      903.8870      1336.3212       1431.893
00:05          623.0835       631.3557      905.4220      1079.6623       1452.260
00:06          623.0835       631.3557      905.4220      1079.6623       1452.260
00:07          623.0835       631.3557      905.4220      1079.6623       1452.260
00:08          623.0780       631.3483      905.3496      1056.3719       1375.610
00:09          623.0671       630.4275      903.8839      1170.8196       1356.963
00:10          623.0507       630.0261      741.8475      1006.1208       1462.271

Ideally, I would like to have the 0.5 quantile as a black line and the others as shaded colour intervals surrounding the black line. What's the best way to do this? I've been looking around with no luck: I can't find examples of this at all, let alone with ggplot2.

Any help would be appreciated.


Cheers!

2 solutions

#1


9  

Does this do what you want? The trick to ggplot is understanding that it expects data in long format, i.e. one row per observation rather than one column per quantile. This often means that we have to transform the data before it is ready to plot, usually with melt() from the reshape2 package.
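To make "long format" concrete, here is a minimal base-R sketch of the same reshaping on a cut-down copy of the question's data (two time points, three quantiles). The object names wide and long are placeholders of mine, and stack() is used here only as a dependency-free stand-in for melt(); the column names value and variable are chosen to mirror what melt() produces.

```r
# Two rows of the question's data in wide format (one column per quantile).
# read.table() turns the header "quantile=0.05" into the name "quantile.0.05".
wide <- read.table(text = "time quantile=0.05 quantile=0.5 quantile=0.95
00:01 623.0725 903.8870 1327.721
00:02 623.0944 911.9967 1518.539",
  header = TRUE, stringsAsFactors = FALSE)

# Long format: one row per (time, quantile) pair.
# stack() concatenates the quantile columns and records the source column in `ind`.
long <- data.frame(
  time = rep(wide$time, times = ncol(wide) - 1),
  stack(wide[, -1])
)
names(long) <- c("time", "value", "variable")

print(long)
```

Each time point now appears once per quantile, which is the shape that lets ggplot map the quantile name to colour or line type.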

After reading your data in with textConnection() and creating an object named dat, here are the steps you'd take:


# Load the packages used below (melt() comes from reshape2)
library(reshape2)
library(ggplot2)

# Melt into long format: one row per (time, quantile) pair
dat.m <- melt(dat, id.vars = "time")

# Not necessary, but if you want different line types depending on quantile, here's how I'd do it.
# lty is stored as a factor because ggplot2 only maps discrete variables to linetype.
dat.m <- within(dat.m, {
  lty <- factor(ifelse(variable == "quantile.0.5", 1,
                ifelse(variable %in% c("quantile.0.25", "quantile.0.75"), 2, 3)))
})

# Plot it
ggplot(dat.m, aes(time, value, group = variable, colour = variable, linetype = lty)) +
  geom_line() +
  scale_colour_manual(name = "", values = c("red", "blue", "black", "blue", "red"))

This gives you a line plot with one coloured line per quantile.

After reading your question again, maybe you want shaded ribbons around the median estimate instead of lines? If so, give this a whirl. The only real trick here is that we pass group = 1 as an aesthetic so that geom_line() will behave properly when the x variable is a factor or character vector. Previously, we grouped by variable, which had the same effect. Also note that we are no longer using the melted data.frame, as the wide data.frame suits us just fine in this case.

ggplot(dat, aes(x = time, group = 1)) +
  geom_ribbon(aes(ymin = quantile.0.05, ymax = quantile.0.95, fill = "05%-95%"), alpha = .25) + 
  geom_ribbon(aes(ymin = quantile.0.25, ymax = quantile.0.75, fill = "25%-75%"), alpha = .25) +
  geom_line(aes(y = quantile.0.5)) +
  scale_fill_manual(name = "", values = c("25%-75%" = "red", "05%-95%" = "blue")) 
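
As a possible variation, group = 1 is only needed because time is a character/factor column. The sketch below converts it to POSIXct so the x axis is continuous. It assumes that "00:01" means hours:minutes and uses an arbitrary placeholder date; the column name time_parsed is mine, and only the first three rows of the question's data are used. The plotting part runs only if ggplot2 is installed.

```r
# The first three rows of the question's data.
dat <- read.table(text = "time quantile=0.05 quantile=0.25 quantile=0.5 quantile=0.75 quantile=0.95
00:01 623.0725 630.4353 903.8870  959.1407 1327.721
00:02 623.0944 631.3707 911.9967 1337.4564 1518.539
00:03 623.0725 630.4353 903.8870 1170.8316 1431.893",
  header = TRUE, stringsAsFactors = FALSE)

# Assumption: "00:01" is hours:minutes; the date is an arbitrary placeholder.
dat$time_parsed <- as.POSIXct(paste("2022-01-01", dat$time),
                              format = "%Y-%m-%d %H:%M", tz = "UTC")

# With a continuous x axis the layers no longer need group = 1.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x = time_parsed)) +
    geom_ribbon(aes(ymin = quantile.0.05, ymax = quantile.0.95), fill = "blue", alpha = .25) +
    geom_ribbon(aes(ymin = quantile.0.25, ymax = quantile.0.75), fill = "red", alpha = .25) +
    geom_line(aes(y = quantile.0.5))
  print(p)
}
```

Here the fill colours are set as constants, so no legend is drawn; mapping them inside aes() as in the code above brings the legend back.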


Edit: To force a legend for the predicted value


We can use the same approach we used for the geom_ribbon() layers. We'll add an aesthetic to geom_line() and then set the values of that aesthetic with scale_colour_manual():


ggplot(dat, aes(x = time, group = 1)) +
  geom_ribbon(aes(ymin = quantile.0.05, ymax = quantile.0.95, fill = "05%-95%"), alpha = .25) + 
  geom_ribbon(aes(ymin = quantile.0.25, ymax = quantile.0.75, fill = "25%-75%"), alpha = .25) +
  geom_line(aes(y = quantile.0.5, colour = "Predicted")) +
  scale_fill_manual(name = "", values = c("25%-75%" = "red", "05%-95%" = "blue")) +
  scale_colour_manual(name = "", values = c("Predicted" = "black"))

There may be more efficient ways to do this, but this is the approach I've always used, with pretty good success. YMMV.

#2


5  

Assuming your data.frame is called df:

The easiest ggplot solution is to use the boxplot geom. This gives a black central line at the median, a filled box spanning the lower and upper quartiles, and whiskers extending to the 0.05 and 0.95 quantiles.

Since you have pre-summarised your data, it is important to specify the stat="identity" parameter:


ggplot(df, aes(x = time)) +
  geom_boxplot(
    aes(
      lower  = quantile.0.25,
      upper  = quantile.0.75,
      middle = quantile.0.5,
      ymin   = quantile.0.05,
      ymax   = quantile.0.95
    ),
    stat = "identity",
    fill = "cyan"
  )
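
For context on what "pre-summarised" means here, the following sketch shows one way such quantile columns might be produced from raw observations with base R. The raw data are simulated purely for illustration (they are not the asker's data), and the names raw, probs and summ are placeholders of mine.

```r
# Hypothetical raw samples: 100 simulated observations per time point.
set.seed(1)
raw <- data.frame(
  time  = rep(c("00:01", "00:02"), each = 100),
  value = rnorm(200, mean = 900, sd = 200)
)

# Compute the five quantiles for each time point.
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
q <- t(sapply(split(raw$value, raw$time), quantile, probs = probs))

# Assemble a wide data.frame with the same column names as in the question.
summ <- data.frame(time = rownames(q), q, row.names = NULL)
names(summ)[-1] <- paste0("quantile.", probs)

print(summ)
```

A table like summ already contains the statistics that geom_boxplot() would otherwise compute itself, which is why stat = "identity" is needed to make ggplot use the values as given.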


PS. I recreated your data as follows:


df <- "time           quantile=0.05  quantile=0.25 quantile=0.5  quantile=0.75   quantile=0.95
00:01          623.0725       630.4353      903.8870       959.1407       1327.721
00:02          623.0944       631.3707      911.9967      1337.4564       1518.539
00:03          623.0725       630.4353      903.8870      1170.8316       1431.893
00:04          623.0725       630.4353      903.8870      1336.3212       1431.893
00:05          623.0835       631.3557      905.4220      1079.6623       1452.260
00:06          623.0835       631.3557      905.4220      1079.6623       1452.260
00:07          623.0835       631.3557      905.4220      1079.6623       1452.260
00:08          623.0780       631.3483      905.3496      1056.3719       1375.610
00:09          623.0671       630.4275      903.8839      1170.8196       1356.963
00:10          623.0507       630.0261      741.8475      1006.1208       1462.271"

df <- read.table(textConnection(df), header=TRUE)
