I have a matrix x with 20 rows and 10 columns. I need to sample 5 rows at a time (with replacement) and calculate the column means. I need to repeat this procedure 15 times and report the column means each time.
As an example, I used the resample library in R to do this.
library("resample")
# Create a random matrix
set.seed(1234)
x <- matrix(round(rnorm(200, 5)), ncol = 10)
## Bootstrap 15 times by resampling 5 rows at a time.
k <- bootstrap(x, colMeans, B = 15, block.size = 5)
My concern with the above procedure is that I'm not sure whether the rows are kept intact, meaning that the column means are calculated within the 5 selected rows. My second question is whether block.size in the above function randomly selects 5 rows, calculates colMeans, repeats this 15 times, and reports the results in replicates as shown below.
k$replicates
stat1 stat2 stat3 stat4 stat5 stat6 stat7 stat8 stat9 stat10
[1,] 4.65 4.50 4.65 5.25 5.25 5.05 4.90 5.60 4.85 5.20
[2,] 4.60 4.65 4.80 5.60 5.50 5.20 5.05 5.10 5.00 5.40
[3,] 4.90 4.35 4.55 5.20 5.80 4.80 4.60 5.30 5.15 4.20
[4,] 4.75 4.65 4.15 5.30 5.25 4.80 4.70 5.15 5.55 4.35
[5,] 4.55 4.65 4.50 5.40 5.40 4.90 4.85 5.55 5.00 4.75
[6,] 4.65 4.25 5.00 5.35 5.20 5.05 4.95 5.20 4.75 5.20
[7,] 4.70 4.30 4.75 5.35 5.50 4.75 5.00 5.45 4.85 4.75
[8,] 4.75 4.15 4.95 5.10 5.55 4.70 4.70 5.30 5.05 4.90
[9,] 4.40 4.30 4.50 5.25 5.50 4.70 4.75 5.35 4.95 4.85
[10,] 4.85 4.50 4.35 5.25 5.70 4.75 4.65 5.35 4.95 4.10
[11,] 4.35 4.50 4.65 5.30 5.20 4.75 4.85 5.30 5.20 5.20
[12,] 4.25 4.55 5.20 5.00 5.45 4.80 4.90 5.15 5.30 5.00
[13,] 4.30 4.70 4.55 5.05 5.35 4.85 5.00 4.90 5.75 4.60
[14,] 4.70 4.35 4.95 5.25 5.40 4.85 4.90 5.20 5.40 5.20
[15,] 4.55 4.70 4.40 5.15 5.20 4.70 4.80 5.45 6.00 4.90
I'm not restricted to this function or package; any other suggestion would be greatly appreciated.
Many thanks
1 answer
Without using a package, you could do it like this:
# your data
set.seed(1234)
x <- matrix( round(rnorm(200, 5)), ncol=10)
# reset the seed for this sampling exercise; define sample size and number of iterations
set.seed(1)
samp_size <- 5
iter <- 15
# here are 15 blocks of 5 numbers, which will index rows of your matrix x
samp_mat <- matrix(sample(seq_len(nrow(x)), samp_size * iter, replace = TRUE),
                   ncol = samp_size, byrow = TRUE)
# example, look at the first 3 blocks
# (the exact indices depend on your R version: R 3.6.0 changed sample()'s default algorithm)
samp_mat[1:3, ]
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    6    8   12   19    5
# [2,]   18   19   14   13    2
# [3,]    5    4   14    8   16
# so, you can get the colMeans for the first block like this
# (i.e., colMeans for rows 6, 8, 12, 19, 5 in this case)
colMeans(x[samp_mat[1, ], ])
# for all 15 blocks:
t(apply(samp_mat, 1, function(i) colMeans(x[i,])))
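If you want to convince yourself that the rows stay intact (every column mean in a replicate comes from the same 5 sampled rows), one way is to rebuild a replicate by hand and compare. This is only a minimal sketch; `boot_means`, `rows_1`, and `by_hand` are names I made up for illustration.

```r
# same data and sampling setup as above
set.seed(1234)
x <- matrix(round(rnorm(200, 5)), ncol = 10)

set.seed(1)
samp_mat <- matrix(sample(seq_len(nrow(x)), 5 * 15, replace = TRUE),
                   ncol = 5, byrow = TRUE)

# one row of column means per block of 5 sampled rows
boot_means <- t(apply(samp_mat, 1, function(i) colMeans(x[i, ])))

# rebuild the first replicate column by column, always using the same 5 rows
rows_1 <- samp_mat[1, ]
by_hand <- sapply(seq_len(ncol(x)), function(j) mean(x[rows_1, j]))

# stops with an error if the two disagree
stopifnot(isTRUE(all.equal(boot_means[1, ], by_hand)))
dim(boot_means)
```

Because `x[i, ]` pulls out whole rows, the same 5 row indices feed every column mean in that replicate.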
...and if you want to squeeze the sampling and the averaging into one statement, you can:
t(apply(matrix(sample(seq_len(nrow(x)), 5 * 15, replace = TRUE), ncol = 5, byrow = TRUE), 1,
        function(i) colMeans(x[i, ])))
(but that's obviously less readable)
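On the block.size question: as far as I understand the resample documentation, block.size controls how many of the B replicates are computed in one batch, not how many rows go into each replicate. Treat that as my reading and check ?bootstrap yourself. There is also a quick arithmetic check that needs no package. x holds whole numbers, so a mean over 5 rows must be a multiple of 0.2, while a mean over all 20 rows is a multiple of 0.05. A rough sketch using the first row of k$replicates as printed in the question (`is_whole` is a made-up helper, not part of any package):

```r
# first row of k$replicates, copied from the question
first_rep <- c(4.65, 4.50, 4.65, 5.25, 5.25, 5.05, 4.90, 5.60, 4.85, 5.20)

# hypothetical helper: TRUE where a value is a whole number, up to floating-point noise
is_whole <- function(v) abs(v - round(v)) < 1e-8

fits_5_rows  <- all(is_whole(first_rep * 5))   # could these be means of 5 whole numbers?
fits_20_rows <- all(is_whole(first_rep * 20))  # could these be means of 20 whole numbers?

fits_5_rows   # FALSE
fits_20_rows  # TRUE
```

A value such as 4.65 cannot be the mean of 5 whole numbers, which suggests that each replicate in the question was averaged over 20 resampled rows, not 5.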