Possible Duplicate:
How do I randomly select k points from N points in MATLAB?
Let's say I have a dataset that includes 10,000 rows of data. What is the best way to create a subset that includes 1,000 randomly chosen rows?
4 Answers
#1
32
You can use randperm for this task:
Sampling without replacement:
nRows = 10000; % number of rows
nSample = 1000; % number of samples
rndIDX = randperm(nRows);
newSample = data(rndIDX(1:nSample), :);
Sampling with replacement:
nRows = 10000; % number of rows
nSample = 1000; % number of samples
rndIDX = randi(nRows, nSample, 1);
newSample = data(rndIDX, :);
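To make the difference between the two modes concrete, here is a minimal sketch that draws both kinds of sample and counts the distinct row indices in each. The `data` matrix is a hypothetical stand-in for the 10,000-row dataset in the question, and the variable names are illustrative only. It assumes a MATLAB release where `randperm` accepts a second argument.

```matlab
% Hypothetical stand-in for the 10,000-row dataset from the question.
data = rand(10000, 3);
nRows = size(data, 1);
nSample = 1000;

% Without replacement: every row index appears at most once.
idxNoRep = randperm(nRows, nSample);
sampleNoRep = data(idxNoRep, :);

% With replacement: the same row index may be drawn more than once.
idxRep = randi(nRows, nSample, 1);
sampleRep = data(idxRep, :);

% Count distinct indices in each draw.
numUniqueNoRep = numel(unique(idxNoRep)); % always equals nSample
numUniqueRep = numel(unique(idxRep));     % never more than nSample
```

If the subset must not contain repeated rows, the first variant is the one to use.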
#2
6
Use randperm in combination with the number of rows. If x is your matrix:
nrows = size(x,1);
nrand = 1000; % Choose 1000 rows
assert(nrand<=nrows, 'You cannot choose more rows than exist in the matrix');
rand_rows = randperm(nrows, nrand);
xx = x(rand_rows,:); % Select the random rows from x
#3
4
If you have the statistics toolbox R2012+, you can use datasample.
subset = datasample(data,1000)
subset will contain 1000 rows randomly selected from data. By default, datasample samples with replacement, so rows can repeat.
To sample without replacement, use:
subset = datasample(data,1000,'Replace',false)
If you have an older version of the toolbox, you can use randsample:
rndIdx = randsample(size(data,1),1000,true); % with replacement
subset = data(rndIdx, :);
rndIdx = randsample(size(data,1),1000,false); % without replacement
subset = data(rndIdx, :);
But using randsample is more or less the same as H.Muster's answer (which I have accepted as the best because it doesn't require any toolbox).
Note: For more info on the difference between sampling with replacement vs. sampling without replacement, see this page.
#4
1
Not sure if you have written any code so far. The following MathWorks link shows examples of random sampling. Take a look at it for ideas.
Here is also a snippet that uses randsample from the Statistics Toolbox. It only shows the logic, so you may have to adjust it to your data.
Given a matrix m with N rows, pull a random sample of n rows from m:
Sample = m(randsample(1:N,n),:)
randsample(1:N,n)
The call above returns n random integers drawn from 1 to N, without replacement by default.