I'm creating a facetted plot to view predicted vs. actual values side by side with a plot of predicted value vs. residuals. I'll be using shiny
to help explore the results of modeling efforts using different training parameters. I train the model with 85% of the data, test on the remaining 15%, and repeat this 5 times, collecting actual/predicted values each time. After calculating the residuals, my data.frame
looks like this:
我正在创建一个facetted的图表,以预测的价值与实际值的对比,并以预测的价值和剩余价值为基础。我将使用闪亮的来帮助探索使用不同训练参数的建模工作的结果。我用85%的数据对模型进行训练,对其余15%进行测试,重复5次,每次收集实际/预测值。在计算剩余值之后,我的数据a框架是这样的:
head(results)
act pred resid
2 52.81000 52.86750 -0.05750133
3 44.46000 42.76825 1.69175252
4 54.58667 49.00482 5.58184181
5 36.23333 35.52386 0.70947731
6 53.22667 48.79429 4.43237981
7 41.72333 41.57504 0.14829173
What I want:
我想要的:
- Side by side plot of
pred
vs.act
andpred
vs.resid
- 在pred vs. act和pred vs. resid的旁边。
- The x/y range/limits for
pred
vs.act
to be the same, ideally frommin(min(results$act), min(results$pred))
tomax(max(results$act), max(results$pred))
- pred与act的x/y范围/限制是相同的,理想情况下是min(min(结果$act), min(结果$pred))到max(max(结果$act), max(结果$pred))
- The x/y range/limits for
pred
vs.resid
not to be affected by what I do to the actual vs. predicted plot. Plotting forx
over only the predicted values andy
over only the residual range is fine. - pred和resid的x/y范围/限制不受我对实际vs预测的情节的影响。用x除以预测值和y除以仅剩下的范围就可以了。
In order to view both plots side by side, I melt the data:
为了将两个图并排地看,我将数据熔化:
library(reshape2)
plot <- melt(results, id.vars = "pred")
Now plot:
现在情节:
library(ggplot2)
p <- ggplot(plot, aes(x = pred, y = value)) + geom_point(size = 2.5) + theme_bw()
p <- p + facet_wrap(~variable, scales = "free")
print(p)
That's pretty close to what I want:
这和我想要的很接近:
What I'd like is for the x and y ranges for actual vs. predicted to be the same, but I'm not sure how to specify that, and I don't need that done for the predicted vs. residual plot since the ranges are completely different.
我想要的是,x和y的取值范围是相同的,但我不确定如何具体说明,我不需要做预测和剩余的图因为范围是完全不同的。
I tried adding something like this for both scale_x_continous
and scale_y_continuous
:
我尝试在scale_x_continous和scale_y_continuous中添加这样的东西:
min_xy <- min(min(plot$pred), min(plot$value))
max_xy <- max(max(plot$pred), max(plot$value))
p <- ggplot(plot, aes(x = pred, y = value)) + geom_point(size = 2.5) + theme_bw()
p <- p + facet_wrap(~variable, scales = "free")
p <- p + scale_x_continuous(limits = c(min_xy, max_xy))
p <- p + scale_y_continuous(limits = c(min_xy, max_xy))
print(p)
But that picks up the min()
of the residual values.
但是,这将获取剩余值的min()。
One last idea I had is to store the value of the minimum act
and pred
variables before melting, and then add them to the melted data frame in order to dictate in which facet they appear:
我的最后一个想法是,在熔化之前存储最小行为和pred变量的值,然后将它们添加到熔化的数据框架中,以指示它们出现在哪个方面:
head(results)
act pred resid
2 52.81000 52.86750 -0.05750133
3 44.46000 42.76825 1.69175252
4 54.58667 49.00482 5.58184181
5 36.23333 35.52386 0.70947731
min_xy <- min(min(results$act), min(results$pred))
max_xy <- max(max(results$act), max(results$pred))
plot <- melt(results, id.vars = "pred")
plot <- rbind(plot, data.frame(pred = c(min_xy, max_xy),
variable = c("act", "act"), value = c(max_xy, min_xy)))
p <- ggplot(plot, aes(x = pred, y = value)) + geom_point(size = 2.5) + theme_bw()
p <- p + facet_wrap(~variable, scales = "free")
print(p)
That does what I want, with the exception that the points show up, too:
这就是我想要的,但也有例外:
Any suggestions for doing something like this?
有什么建议吗?
I saw this idea to add geom_blank()
, but I'm not sure how to specify the aes()
bit and have it work properly, or what the geom_point()
equivalent is to the histogram use of aes(y = max(..count..))
.
我看到这个想法是要添加地_blank(),但是我不确定如何指定aes()位,并使它正常工作,或者是什么地()等效于使用aes的直方图(y = max(. count.. ..))。
Here's data to play with (my actual, predicted, and residual values prior to melting):
这里的数据与(我的实际,预测和剩余价值在融化之前):
> dput(results)
structure(list(act = c(52.81, 44.46, 54.5866666666667, 36.2333333333333,
53.2266666666667, 41.7233333333333, 35.2966666666667, 30.6833333333333,
39.25, 35.8866666666667, 25.1, 29.0466666666667, 23.2766666666667,
56.3866666666667, 42.92, 41.57, 27.92, 23.16, 38.0166666666667,
61.8966666666667, 37.41, 41.6333333333333, 35.9466666666667,
48.9933333333333, 30.5666666666667, 32.08, 40.3633333333333,
53.2266666666667, 64.6066666666667, 38.5366666666667, 41.7233333333333,
25.78, 33.4066666666667, 27.8033333333333, 39.3266666666667,
48.9933333333333, 25.2433333333333, 32.67, 55.17, 42.92, 54.5866666666667,
23.16, 64.6066666666667, 40.7966666666667, 39.0166666666667,
41.6333333333333, 35.8866666666667, 25.1, 23.2766666666667, 44.46,
34.2166666666667, 40.8033333333333, 24.5766666666667, 35.73,
61.8966666666667, 62.1833333333333, 74.6466666666667, 39.4366666666667,
36.6, 27.1333333333333), pred = c(52.8675013282404, 42.7682474758679,
49.0048248585123, 35.5238560262515, 48.7942868566949, 41.5750416040131,
33.9548164913007, 29.9787449128663, 37.6443975781139, 36.7196211666685,
27.6043278172077, 27.0615724310721, 31.2073056885252, 55.0886903524179,
43.0895814712768, 43.0895814712768, 32.3549865881578, 26.2428426737583,
36.6926037128343, 56.7987490221996, 45.0370788180147, 41.8231642271826,
38.3297859332601, 49.5343916620086, 30.8535641206809, 29.0117492750411,
36.9767968381391, 49.0826677983065, 54.4678549541069, 35.5059204731218,
41.5333417555995, 27.6069075391361, 31.2404889715121, 27.8920960978598,
37.8505531149324, 49.2616631533957, 30.366837650159, 31.1623492639066,
55.0456078770405, 42.772538591063, 49.2419293590535, 26.1963523976241,
54.4080781796616, 44.9796700541254, 34.6996927469131, 41.6227713664027,
36.8449646519306, 27.5318686661673, 31.6641793552795, 42.8198894266632,
40.5769177148146, 40.5769177148146, 29.3807781312816, 36.8579132935989,
55.5617033901752, 55.8097119335638, 55.1041728261666, 43.6094641699075,
37.0674887276681, 27.3876960746536), resid = c(-0.0575013282403773,
1.69175252413213, 5.58184180815435, 0.709477307081826, 4.43237980997177,
0.148291729320228, 1.34185017536599, 0.704588420467079, 1.60560242188613,
-0.832954500001826, -2.50432781720766, 1.98509423559461, -7.93063902185855,
1.29797631424874, -0.169581471276786, -1.51958147127679, -4.43498658815778,
-3.08284267375831, 1.32406295383237, 5.09791764446704, -7.62707881801468,
-0.189830893849219, -2.38311926659339, -0.541058328675241, -0.286897454014273,
3.06825072495888, 3.38653649519422, 4.14399886836018, 10.1388117125598,
3.03074619354486, 0.189991577733821, -1.82690753913609, 2.16617769515461,
-0.088762764526507, 1.47611355173427, -0.268329820062384, -5.12350431682565,
1.5076507360934, 0.124392122959534, 0.147461408936991, 5.34473730761318,
-3.03635239762411, 10.1985884870051, -4.18300338745873, 4.31697391975358,
0.0105619669306023, -0.958297985263961, -2.43186866616734, -8.38751268861282,
1.64011057333683, -6.36025104814794, 0.226415618518729, -4.80411146461488,
-1.1279132935989, 6.33496327649151, 6.37362139976954, 19.5424938405001,
-4.17279750324084, -0.467488727668119, -0.254362741320246)), .Names = c("act",
"pred", "resid"), row.names = c(2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 12L, 13L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L,
24L, 25L, 26L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L,
38L, 39L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L,
52L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L
), class = "data.frame")
3 个解决方案
#1
47
Here's some code with a dummy geom_blank
layer,
下面是一些代码,其中有一个伪地空层,
range_act <- range(range(results$act), range(results$pred))
d <- reshape2::melt(results, id.vars = "pred")
dummy <- data.frame(pred = range_act, value = range_act,
variable = "act", stringsAsFactors=FALSE)
ggplot(d, aes(x = pred, y = value)) +
facet_wrap(~variable, scales = "free") +
geom_point(size = 2.5) +
geom_blank(data=dummy) +
theme_bw()
#2
7
I am not sure I understand what you want, but based on what I understood
我不确定我是否理解你想要什么,但我的理解是基于我的理解。
the x scale seems to be the same, it is the y scale that is not the same, and that is because you specified scales ="free"
x的刻度看起来是一样的,这是y刻度,这是不一样的,这是因为你指定的刻度="free"
you can specify scales = "free_x" to only allow x to be free (in this case it is the same as pred has the same range by definition)
您可以指定scale = "free_x",只允许x是免费的(在这种情况下,它与pred的定义相同)
p <- ggplot(plot, aes(x = pred, y = value)) + geom_point(size = 2.5) + theme_bw()
p <- p + facet_wrap(~variable, scales = "free_x")
worked for me, see the picture
为我工作,看照片。
I think you were making it too difficult - I do seem to remember one time defining the limits based on a formula with min and max and if faceted I think it used only those values, but I can't find the code
我觉得你让它变得太困难了——我似乎记得有一次,根据一个公式定义了一个基于最小和最大值的公式,如果我认为它只使用那些值,但是我找不到代码。
#3
2
You can also specify the range with the coord_cartesian command to set the y-axis range that you want, an like in the previous post use scales = free_x
您还可以使用coord_cartesian命令指定范围来设置您想要的y轴范围,就像前面的post使用scale = free_x一样。
p <- ggplot(plot, aes(x = pred, y = value)) +
geom_point(size = 2.5) +
theme_bw()+
coord_cartesian(ylim = c(-20, 80))
p <- p + facet_wrap(~variable, scales = "free_x")
p
#1
47
Here's some code with a dummy geom_blank
layer,
下面是一些代码,其中有一个伪地空层,
range_act <- range(range(results$act), range(results$pred))
d <- reshape2::melt(results, id.vars = "pred")
dummy <- data.frame(pred = range_act, value = range_act,
variable = "act", stringsAsFactors=FALSE)
ggplot(d, aes(x = pred, y = value)) +
facet_wrap(~variable, scales = "free") +
geom_point(size = 2.5) +
geom_blank(data=dummy) +
theme_bw()
#2
7
I am not sure I understand what you want, but based on what I understood
我不确定我是否理解你想要什么,但我的理解是基于我的理解。
the x scale seems to be the same, it is the y scale that is not the same, and that is because you specified scales ="free"
x的刻度看起来是一样的,这是y刻度,这是不一样的,这是因为你指定的刻度="free"
you can specify scales = "free_x" to only allow x to be free (in this case it is the same as pred has the same range by definition)
您可以指定scale = "free_x",只允许x是免费的(在这种情况下,它与pred的定义相同)
p <- ggplot(plot, aes(x = pred, y = value)) + geom_point(size = 2.5) + theme_bw()
p <- p + facet_wrap(~variable, scales = "free_x")
worked for me, see the picture
为我工作,看照片。
I think you were making it too difficult - I do seem to remember one time defining the limits based on a formula with min and max and if faceted I think it used only those values, but I can't find the code
我觉得你让它变得太困难了——我似乎记得有一次,根据一个公式定义了一个基于最小和最大值的公式,如果我认为它只使用那些值,但是我找不到代码。
#3
2
You can also specify the range with the coord_cartesian command to set the y-axis range that you want, an like in the previous post use scales = free_x
您还可以使用coord_cartesian命令指定范围来设置您想要的y轴范围,就像前面的post使用scale = free_x一样。
p <- ggplot(plot, aes(x = pred, y = value)) +
geom_point(size = 2.5) +
theme_bw()+
coord_cartesian(ylim = c(-20, 80))
p <- p + facet_wrap(~variable, scales = "free_x")
p