I have a data frame like this, with a varying number of observations per id:
library(dplyr)
df <- data.frame(id=c(1,1,1,1,1,2,2,3), v1= rnorm(8), v2=rnorm(8))
I then group by id:
by_id <- group_by(df, id)
Now I want to calculate the mean and standard deviation of v1 for each id. This is easy with summarise:
df2 <- summarise(by_id,
v1.mean=mean(v1),
v1.sd=sd(v1))
Now I want to add the slope of a linear regression of v1 on v2:
df2 <- summarise(by_id,
v1.mean=mean(v1),
v1.sd=sd(v1),
slope=as.vector(coef(lm(v1~v2,na.action="na.omit")[2])))
However, this fails. I think the reason is that one id (id = 3) has only one observation, so a linear model cannot be fitted for it.
I also tried:
slope=ifelse(n()==1,0,as.vector(coef(lm(v1~v2,na.action="na.omit")[2]))))
but that does not work either. Is there an easy solution for this?
Note that even when an id has more than one observation, lm might also fail if, for example, v2 has a missing value.
1 Answer
#1
6
You can try this:
group_by(df, id) %>% do(fit = lm(v1~v2, .)) %>% summarise(intercept = coef(fit)[1], slope= coef(fit)[2])
Source: local data frame [3 x 2]
intercept slope
1 -0.3116880 0.2698022
2 -1.2303663 0.4949600
3 0.3169372 NA
Note the use of do, and of . as the data argument inside the lm call.
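The question also mentions missing values in v2. With a single observation lm still returns a fit whose slope is NA (as in the row for id 3 above), but if every row of a group is dropped by na.omit, lm stops with an error. One possible way to guard against both cases is a small helper used directly inside summarise. This is only a sketch: safe_slope is a hypothetical name that is not part of dplyr or the answer above, and the NA placed in v2 is made up to simulate the missing-value case.

```r
library(dplyr)

# Hypothetical helper (an assumption, not part of dplyr or the original answer):
# returns the slope of y ~ x, or NA when fewer than two complete (x, y) pairs exist.
safe_slope <- function(y, x) {
  ok <- complete.cases(x, y)            # rows where both values are present
  if (sum(ok) < 2) return(NA_real_)     # too few points to estimate a slope
  unname(coef(lm(y[ok] ~ x[ok]))[2])    # second coefficient is the slope
}

set.seed(1)  # only to make the example reproducible
df <- data.frame(id = c(1, 1, 1, 1, 1, 2, 2, 3), v1 = rnorm(8), v2 = rnorm(8))
df$v2[6] <- NA  # simulated missing value in group 2 (illustration only)

df2 <- df %>%
  group_by(id) %>%
  summarise(v1.mean = mean(v1),
            v1.sd   = sd(v1),
            slope   = safe_slope(v1, v2))
```

Here id 1 gets a numeric slope, while id 2 (one complete pair left) and id 3 (one observation) get NA instead of an error. This keeps everything in one summarise call, which may be convenient since do is marked as superseded in dplyr 1.0.0 and later.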