
Time: 2022-11-26 14:58:06

Slightly bizarre request, I know, but bear with me.


I have an Excel spreadsheet with some logging data taken from a highly parallelised bit of server-side code. I'm trying to analyse it for where there may be gaps in the logs, indicating tasks that should be logged but aren't; but because it's a serial, timestamp-order list of a dozen or so parallel threads it's quite hard to read. So I had the unorthodox idea of using a Gantt chart to visualise the overlapping tasks. Excel is terrible at this, so I started looking at alternative tools, and I thought of trying R.


Each task in the log has a start timestamp, an end timestamp, and a duration, so I have the data that I need. I read this SO post and mutilated the example into this R script:


library(ggplot2)
library(reshape2)  # for melt()

tasks <- c("Task1", "Task2")
dfr <- data.frame(
  name        = factor(tasks, levels = tasks),
  start.date  = c("07/08/2013 09:03:25.815", "07/08/2013 09:03:25.956"),
  end.date    = c("07/08/2013 09:03:28.300", "07/08/2013 09:03:30.409"),
  is.critical = c(TRUE, TRUE)
)

mdfr <- melt(dfr, measure.vars = c("start.date", "end.date"))

ggplot(mdfr, aes(as.Date(value, "%d/%m/%Y %H:%M:%OS"), name, colour = is.critical)) +
  geom_line(size = 6) +
  xlab("") + ylab("")

This doesn't work, though -- it doesn't plot any data, and the time axis is all messed up. I suspect (unsurprisingly) that plotting sub-second Gantt charts is a weird thing to do. I'm a complete R newbie (although I've been looking for an excuse to try it out for ages) -- is there any simple way to make this work?


1 solution



First, your times should be stored as POSIXct rather than Date, because they include hours, minutes, and fractional seconds, all of which Date discards. You can add a new column in the correct format to your melted data frame.


mdfr$time <- as.POSIXct(strptime(mdfr$value, "%d/%m/%Y %H:%M:%OS"))
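
To see why the Date version plots nothing, here is a minimal base-R sketch using the two Task1 timestamps from the question. The variable names (`x`, `fmt`, `as_date`, `as_time`) and the `tz = "UTC"` setting are assumptions made for illustration; the original does not specify a time zone.

```r
# Base R only. The two strings are Task1's start and end from the question.
op <- options(digits.secs = 3)  # show fractional seconds when printing

x   <- c("07/08/2013 09:03:25.815", "07/08/2013 09:03:28.300")
fmt <- "%d/%m/%Y %H:%M:%OS"

# Date keeps only the calendar day, so start and end become identical.
as_date <- as.Date(x, format = fmt)

# POSIXct keeps the full timestamp, including the sub-second part.
# tz = "UTC" is an assumption; use the zone your logs were written in.
as_time <- as.POSIXct(x, format = fmt, tz = "UTC")

print(as_date)
print(diff(as_date))   # zero days apart: a bar with no length
print(as_time)
print(difftime(as_time[2], as_time[1], units = "secs"))  # about 2.485 seconds

options(op)  # restore the previous printing option
```

With Date, both ends of every task collapse onto the same day, which is why the line segments have zero length on the plot.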


Then with scale_x_datetime() you can control where the breaks fall on the axis. Use the new, correctly formatted column for the x values.


library(scales)  # provides date_breaks()

ggplot(mdfr, aes(time, name, colour = is.critical)) +
  geom_line(size = 6) +
  xlab("") + ylab("") +
  scale_x_datetime(breaks = date_breaks("2 sec"))
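
Putting the pieces together, the following is a sketch of a complete script, assuming the ggplot2 and reshape2 packages are installed. A few details are my own assumptions rather than part of the answer: the `tz = "UTC"` setting, the `date_labels` format, and the use of the `date_breaks =` argument of `scale_x_datetime()` in place of `scales::date_breaks()`, which avoids loading scales separately.

```r
library(ggplot2)   # ggplot(), geom_line(), scale_x_datetime()
library(reshape2)  # melt()

tasks <- c("Task1", "Task2")
dfr <- data.frame(
  name        = factor(tasks, levels = tasks),
  start.date  = c("07/08/2013 09:03:25.815", "07/08/2013 09:03:25.956"),
  end.date    = c("07/08/2013 09:03:28.300", "07/08/2013 09:03:30.409"),
  is.critical = c(TRUE, TRUE),
  stringsAsFactors = FALSE  # keep the timestamps as plain strings
)

# Long format: one row per task endpoint (start or end).
mdfr <- melt(dfr, measure.vars = c("start.date", "end.date"))

# Parse the strings to POSIXct so the time of day survives.
# tz = "UTC" is an assumption; the question does not state a time zone.
mdfr$time <- as.POSIXct(strptime(mdfr$value, "%d/%m/%Y %H:%M:%OS", tz = "UTC"))

# One thick line per task, running from its start to its end.
# Recent ggplot2 releases prefer `linewidth` over `size` for lines;
# `size` is kept here to match the code above.
p <- ggplot(mdfr, aes(time, name, colour = is.critical)) +
  geom_line(size = 6) +
  xlab("") + ylab("") +
  scale_x_datetime(date_breaks = "2 sec", date_labels = "%H:%M:%S")

print(p)
```

For a real log with many threads, the same script should work unchanged once `dfr` is read from the spreadsheet export, although the break interval will probably need adjusting to the overall time span.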


