In an effort to help populate the R tag here, I am posting a few questions I have often received from students. I have developed my own answers to these over the years, but perhaps there are better ways floating around that I don't know about.
The question: I just ran a regression with continuous y and x but factor f (where levels(f) produces c("level1","level2"))
thelm <- lm(y~x*f,data=thedata)
Now I would like to plot the predicted values of y by x, broken down by the groups defined by f. All of the plots I get are ugly and show too many lines.
My answer: Try the predict() function.
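## restrict prediction to the x range actually used in the fit
## by using thelm$model rather than thedata
newdata <- expand.grid(x=range(thelm$model$x),
                       f=levels(thelm$model$f))
## keep the predictions in their own data frame: one row per x endpoint and level
newdata$yhat <- predict(thelm, newdata=newdata)
plot(yhat~x,data=newdata,subset=f=="level1",type="l",
     ylim=range(newdata$yhat))
lines(yhat~x,data=newdata,subset=f=="level2",lty=2)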
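For anyone who wants to run the predict() idea end to end, here is a rough sketch in base R. The simulated data and the loop over levels are my own additions for illustration (they are not from the question), and the approach assumes f is stored as a factor.

```r
## Simulated stand-in data (hypothetical); f must be a factor
set.seed(42)
thedata <- data.frame(x = rnorm(40),
                      f = factor(rep(c("level1", "level2"), 20)))
thedata$y <- 1 + 2 * thedata$x * (thedata$f == "level2") + rnorm(40)

thelm <- lm(y ~ x * f, data = thedata)

## One row per (x endpoint, level); two x values are enough for a straight line
newdata <- expand.grid(x = range(thelm$model$x),
                       f = levels(thelm$model$f))
newdata$yhat <- predict(thelm, newdata = newdata)

## Empty frame sized to all predictions, then one line per level
plot(yhat ~ x, data = newdata, type = "n")
lev <- levels(newdata$f)
for (i in seq_along(lev)) {
  lines(yhat ~ x, data = newdata[newdata$f == lev[i], ], col = i)
}
legend("topleft", legend = lev, col = seq_along(lev), lty = 1)
```

Looping over levels(newdata$f) means the same code should carry over to a factor with more than two levels.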
Are there other ideas out there that are (1) easier to understand for a newcomer and/or (2) better from some other perspective?
4 solutions
#1
17
The effects package has good plotting methods for visualizing the predicted values of regressions.
## f must be a factor so that as.numeric() below returns the level codes 1 and 2
thedata<-data.frame(x=rnorm(20),f=factor(rep(c("level1","level2"),10)))
thedata$y<-rnorm(20,,3)+thedata$x*(as.numeric(thedata$f)-1)
library(effects)
model.lm <- lm(formula=y ~ x*f,data=thedata)
plot(effect(term="x:f",mod=model.lm,default.levels=20),multiline=TRUE)
#2
3
Huh - still trying to wrap my brain around expand.grid(). Just for comparison's sake, this is how I'd do it (using ggplot2):
library(ggplot2)
## name the columns so that aes() can refer to x, yhat and f
thedata <- data.frame(yhat = predict(thelm), x = thelm$model$x, f = thelm$model$f)
ggplot(thedata, aes(x = x, y = yhat, group = f, color = f)) + geom_line()
The ggplot() logic is pretty intuitive, I think - group and color the lines by f. With increasing numbers of groups, not having to specify a layer for each is increasingly helpful.
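To make the point about more groups concrete, here is a small sketch with a three-level factor. The data, the level names "a", "b", "c", and the object names dat, fit and pred are all made up for illustration. The plotting call is the same one-liner regardless of how many levels there are. The requireNamespace() guard is only there so the snippet still runs where ggplot2 is not installed.

```r
## Hypothetical data with three groups instead of two
set.seed(7)
dat <- data.frame(x = runif(90, 0, 10),
                  f = factor(rep(c("a", "b", "c"), each = 30)))
dat$y <- 1 + as.integer(dat$f) * 0.5 * dat$x + rnorm(90)
fit <- lm(y ~ x * f, data = dat)

## Fitted values at the observed x, with named columns for aes()
pred <- data.frame(yhat = predict(fit), x = fit$model$x, f = fit$model$f)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  ## One geom_line() layer draws a separate line for every level of f
  p <- ggplot(pred, aes(x = x, y = yhat, group = f, color = f)) + geom_line()
  print(p)
}
```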
#3
2
I am no expert in R. But I use:
library(lattice)
## Dat is the data frame holding y, x and f
xyplot(y ~ x, groups= f, data= Dat, type= c('p','r'),
       grid= TRUE, lwd= 3, auto.key= TRUE)
This is also an option:
interaction.plot(f,x,y, type="b", col=c(1:3),
leg.bty="o", leg.bg="beige", lwd=1, pch=c(18,24),
xlab="",
ylab="",
trace.label="",
main="Interaction Plot")
#4
0
Here is a small change to the excellent suggestion by Matt, and a solution similar to Helgi's but with ggplot. The only difference from the above is that I have used geom_smooth(method='lm'), which plots the regression lines directly.
set.seed(1)
y = runif(100,1,10)
x = runif(100,1,10)
f = rep(c('level 1','level 2'),50)
thedata = data.frame(x,y,f)
library(ggplot2)
ggplot(thedata,aes(x=x,y=y,color=f))+geom_smooth(method='lm',se=F)
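In case it is not obvious why this matches the lm(y~x*f) model from the question: with color=f, the smoother is fitted separately within each level, and a full interaction model gives the same intercept and slope per level. Here is a quick base R check on the same simulated data. The object names pooled, separate and implied are mine, and I convert f to a factor explicitly.

```r
set.seed(1)
y <- runif(100, 1, 10)
x <- runif(100, 1, 10)
f <- factor(rep(c("level 1", "level 2"), 50))
thedata <- data.frame(x, y, f)

## Single model with the interaction, as in the question
pooled <- lm(y ~ x * f, data = thedata)
b <- coef(pooled)

## Separate straight-line fit within each level of f
separate <- sapply(split(thedata, thedata$f),
                   function(d) coef(lm(y ~ x, data = d)))

## Intercept and slope per level implied by the interaction model
implied <- cbind("level 1" = c(b["(Intercept)"], b["x"]),
                 "level 2" = c(b["(Intercept)"] + b["flevel 2"],
                               b["x"] + b["x:flevel 2"]))

all.equal(unname(separate), unname(implied))
```

The lines agree, but the confidence bands would not necessarily: the interaction model pools the residual variance across levels, while the per-level fits estimate it separately.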