Operations conditional on a time index for longitudinal data in R

Date: 2021-10-11 23:00:39

I have some data organized in the longitudinal format, i.e.


    id <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
    time <- rep(c(1990, 1995, 2000, 2005), 4)
    w <- runif(16, min=0, max=1)
    u <- runif(16, min=0, max=0.5)
    dat <- cbind(id, time, w, u)
    dat
    id time         w         u
     1 1990 0.6550168 0.2114829
     1 1995 0.9669285 0.2253474
     1 2000 0.8879138 0.2733263
     1 2005 0.1079913 0.4452164
     2 1990 0.1483843 0.1949214
     2 1995 0.7599596 0.1632965
     2 2000 0.7119100 0.3600129
     2 2005 0.4164409 0.2456366
     3 1990 0.7881798 0.3233312
     3 1995 0.8627986 0.1180433
     3 2000 0.3253139 0.3491878
     3 2005 0.2560138 0.3193816
     4 1990 0.2062351 0.3485047
     4 1995 0.4145230 0.1413814
     4 2000 0.3053510 0.1782681
     4 2005 0.7419894 0.3738163

I need to compute B as follows:


[Image: formula defining B, not preserved in the source]

where t and s refer to time. I tried a for loop using two indexes, i and j, but I got no output. Then I tried a different approach:


    B.small = list()
    for (r in c(1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010)){
         B.small = (set$w[set$year ==r, ]%*% t(set$w)%*%set$u[set$year ==r, ]*set$u)/n
    }
    B = sum(B.Small)/n

Error in set$u[set$year == r, ] : incorrect number of dimensions

B must be a scalar. I guess there must be an alternative, ideally one that does not use a loop.


1 solution

#1


1  

I may not have fully understood your formula, and there is surely a better way to do it, but I'd try:


    # split the w and u columns by year
    splitDT <- split(as.data.frame(dat[, 3:4]), dat[, 2])
    # build all pairs of year indices
    indices <- expand.grid(seq_along(splitDT), seq_along(splitDT))
    # for each pair of years, average the product over ids and store it in a matrix
    res <- matrix(mapply(function(x, y) mean(splitDT[[x]]$u * splitDT[[y]]$u * splitDT[[x]]$w * splitDT[[y]]$w),
                         indices[, 1], indices[, 2]),
                  ncol = length(splitDT))
    # get the result
    sum(res) / ncol(res)
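
Since the question asks for a version without a loop, here is a minimal sketch of one possibility. It assumes the same reading of the formula as the code above, namely that B is the sum over all pairs of years (t, s) of the mean over ids of w_it * u_it * w_is * u_is, divided by the number of years. Under that assumption the double sum over (t, s) factors into a squared per-id sum, so a single `tapply` is enough. The seed, the `data.frame` layout, and the names `n_id`, `n_time` and `z` are illustrative choices, not part of the original question.

```r
# Illustrative data in the same layout as the question (values are random).
set.seed(1)  # assumption: seed added only so the sketch is reproducible
dat <- data.frame(
  id   = rep(1:4, each = 4),
  time = rep(c(1990, 1995, 2000, 2005), times = 4),
  w    = runif(16, min = 0, max = 1),
  u    = runif(16, min = 0, max = 0.5)
)

n_id   <- length(unique(dat$id))    # number of individuals
n_time <- length(unique(dat$time))  # number of time points

# For each id, sum w * u over time. Squaring that sum equals the
# double sum over (t, s) of w_it * u_it * w_is * u_is for that id.
z <- tapply(dat$w * dat$u, dat$id, sum)

# Average over ids and divide by the number of years (assumed scaling,
# chosen to match sum(res) / ncol(res) in the code above).
B <- sum(z^2) / (n_id * n_time)
B
```

If the intended formula uses a different scaling or excludes the t = s terms, the last step would need to be adjusted accordingly.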
