Use histogram breaks to apply a function over a second column

Time: 2022-06-03 14:57:49

I have multiple columns of data, let's say x and time. I want to make a histogram of column x, and color each bar based off an aggregation of the values in column time, where the aggregation is grouped by the breaks used for the histogram. So,

d <- cbind(c(rep(1,3), rep(2,3)), c(10,20,10,20,10,20))
# cbind() returns a matrix, so the column labels go in colnames(), not names()
colnames(d) <- c("x", "time")
hist(d[,"x"])

Gives me a nice barplot, and let's say I want something like this for my colors:

palette(rainbow(25))
hist(d[,"x"], col=d[,"time"], n=10)

I would like to have the col be a vector of length 10 that is an aggregated function (such as mean) of the time column.

2 solutions

#1


1  

I would do this with plyr and ggplot2:

library(plyr)
library(ggplot2)

# Example data: four x values, four random times each
d <- data.frame(x=rep(1:4, each=4), time=sample(10:100, 16, replace=TRUE))
# Attach the per-x mean of time to every row
d <- ddply(d, .(x), transform, mean.time=mean(time))

ggplot(d, aes(x=x, group=x, fill=mean.time)) +
  geom_histogram()
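The `ddply(..., transform, ...)` step above only attaches a per-group mean to every row. As a minimal sketch, assuming you would rather not depend on plyr, base R's `ave()` does the same thing; the `set.seed()` call is my own addition so the random data is reproducible.

```r
set.seed(1)  # assumption: fixed seed, not part of the original answer

d <- data.frame(x = rep(1:4, each = 4),
                time = sample(10:100, 16, replace = TRUE))

# ave() returns a vector as long as its input, holding the group-wise
# result for each row; its default summary function is mean().
d$mean.time <- ave(d$time, d$x)

head(d)
```

The resulting `d` has the same `x`, `time` and `mean.time` columns, so the `ggplot()` call can be used unchanged.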

#2


0  

If I understood correctly, you would like to average the time values for each x and plot a histogram. But which colours do you want to use? A gradient or individual colours, based on the mean time values or on the x values?

Consider this example as a starting point

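library(ggplot2)
d <- data.frame(x=rep(1:4, each=4), time=sample(10:100, 16, replace=TRUE)) # thanks to Andy :)
# One bar per x, bar height = mean(time), one discrete colour per x.
# `fun` replaces the deprecated `fun.y` in ggplot2 >= 3.3.0; refer to
# columns by name inside aes() rather than as d$x.
ggplot(d, aes(x=factor(x), y=time)) +
  stat_summary(fun="mean", geom="bar", aes(fill=factor(x)))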

or

# Same bars, but x is mapped to a continuous colour gradient
ggplot(d, aes(x=factor(x), y=time)) +
  stat_summary(fun="mean", geom="bar", aes(fill=x))
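Both examples above group by the distinct values of x, not by the breaks that `hist()` picks. A minimal base-R sketch of the latter follows, assuming the question's toy data and its 25-colour rainbow palette. The names `bin`, `bin_mean`, `idx` and `bar_col` are my own, and the linear mapping from mean to palette index is just one possible choice.

```r
# Toy data from the question, as a data frame so columns have names.
d <- data.frame(x = c(rep(1, 3), rep(2, 3)),
                time = c(10, 20, 10, 20, 10, 20))

# Compute the bins without drawing, so the same breaks drive both steps.
h <- hist(d$x, breaks = 10, plot = FALSE)

# Assign each row to a bin using hist()'s default conventions
# (right-closed intervals, lowest break included).
bin <- cut(d$x, breaks = h$breaks, include.lowest = TRUE, right = TRUE)

# Mean of time per bin; bins with no observations come back as NA.
bin_mean <- tapply(d$time, bin, mean)

# Rescale the means linearly onto palette indices 1..25.
pal <- rainbow(25)
rng <- range(bin_mean, na.rm = TRUE)
idx <- if (diff(rng) == 0) {
  rep(1L, length(bin_mean))
} else {
  1 + round((bin_mean - rng[1]) / diff(rng) * (length(pal) - 1))
}
bar_col <- pal[idx]  # NA for empty bins, which are drawn without fill

# Draw the precomputed histogram, one colour per bar.
plot(h, col = bar_col, main = "x, coloured by mean time per bin", xlab = "x")
```

`bar_col` has one entry per bar, which is the vector the question asks for. Swapping `mean` for another function in the `tapply()` call changes the aggregation. Note that `cut()` does not apply the small fuzz `hist()` adds to its breaks, so values lying almost exactly on a break could in principle land in a neighbouring bin.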
