Suppose we have an array:
A = zeros([1,10]);
We also have a sequence of indices that may contain duplicates, say:
indSeq = [1,1,2,3,4,4,4];
How can we increase A(i) by the number of times i occurs in the index sequence, i.e. A(1) = 2, A(2) = 1, A(3) = 1, A(4) = 3?
The code A(indSeq) = A(indSeq)+1 does not work.
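A short sketch of why the vectorized assignment falls short, using the A and indSeq from the question (the variable rhs is introduced here only for illustration): the right-hand side is computed once, so a repeated index is overwritten rather than incremented again.

```matlab
A = zeros(1,10);
indSeq = [1,1,2,3,4,4,4];

% The right-hand side is evaluated once, before any assignment happens,
% so every entry of A(indSeq)+1 equals 1 here.
rhs = A(indSeq) + 1;      % 1 1 1 1 1 1 1

% Repeated indices overwrite the same element with the same value.
A(indSeq) = rhs;

disp(A(1:4))              % values 1 1 1 1, not the desired 2 1 1 3
```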
I know that I can use the following for-loop to achieve the goal, but I wonder if there is any way to avoid the for-loop. We can assume that indSeq is sorted.
A for-loop solution:
for i = 1:length(indSeq)
    A(indSeq(i)) = A(indSeq(i)) + 1;
end
3 Answers
#1
3
You can use accumarray for such a label-based counting job, like so:
accumarray(indSeq(:),1)
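The call above returns a column vector whose length is max(indSeq), not the length of A. A minimal sketch of folding the counts back into the preallocated A from the question, assuming every index fits inside A (the variable counts is introduced here only for illustration):

```matlab
A = zeros(1,10);
indSeq = [1,1,2,3,4,4,4];

% Count occurrences of each index; the size argument [numel(A) 1]
% pads the result with zeros up to the length of A.
counts = accumarray(indSeq(:), 1, [numel(A) 1]);

% accumarray returns a column vector here, so transpose before adding.
A = A + counts.';
```

Because the counts are added rather than assigned, this also works when A does not start out as all zeros.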
Benchmarking
As suggested in the other answer, you can also use hist/histc. Let's benchmark these two on a large data size. The benchmarking code I used was:
%// Create huge random array filled with ints that are duplicated & sorted
maxn = 100000;
N = 10000000;
indSeq = sort(randi(maxn,1,N));
disp('--------------------- With HISTC')
tic,histc(indSeq,unique(indSeq));toc
disp('--------------------- With ACCUMARRAY')
tic,accumarray(indSeq(:),1);toc
Runtime output:
--------------------- With HISTC
Elapsed time is 1.028165 seconds.
--------------------- With ACCUMARRAY
Elapsed time is 0.220202 seconds.
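The two timed calls do not return quite the same thing: histc(indSeq,unique(indSeq)) gives counts only for the values that occur, while accumarray(indSeq(:),1) gives one count per label from 1 to max(indSeq). A small sanity check, assuming much smaller sizes than in the benchmark and passing the full label range to both (cHistc and cAccum are names chosen here for illustration), could look like this:

```matlab
maxn = 100;                       % assumed small sizes, for a quick check only
N = 10000;
indSeq = sort(randi(maxn,1,N));

% Counts for every label 1..maxn, including labels that never occur.
cHistc = histc(indSeq, 1:maxn);
cAccum = accumarray(indSeq(:), 1, [maxn 1]);

% Compare as column vectors so the orientation does not matter.
assert(isequal(cHistc(:), cAccum(:)))
```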
#2
1
This is run-length encoding, and the following code should do the trick for you.
A=zeros(1,10);
indSeq = [1,1,2,3,4,4,4,7,1];
indSeq=sort(indSeq); %// if your input is always sorted, you don't need to do this
pos = [1; find(diff(indSeq(:)))+1; numel(indSeq)+1];
A(indSeq(pos(1:end-1)))=diff(pos)
which returns
A =
3 1 1 3 0 0 1 0 0 0
This algorithm was written by Luis Mendo for MATL.
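To make the run-length idea easier to follow, here is the same computation with the intermediate values written out. This is only an annotated sketch of the code in this answer; the names vals and lens are introduced here for illustration, and the last line increments A instead of overwriting it, which gives the same result when A starts as zeros.

```matlab
A = zeros(1,10);
indSeq = sort([1,1,2,3,4,4,4,7,1]);   % 1 1 1 2 3 4 4 4 7

% Start position of each run, plus one past the end as a sentinel.
pos = [1; find(diff(indSeq(:)))+1; numel(indSeq)+1];   % 1 4 5 6 9 10

vals = indSeq(pos(1:end-1));   % value of each run:  1 2 3 4 7
lens = diff(pos).';            % length of each run: 3 1 1 3 1

% vals contains no duplicates, so the vectorized increment is safe here.
A(vals) = A(vals) + lens;
```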
#3
1
I think what you are looking for is the number of occurrences of each unique value in the array. This can be accomplished with:
[num, val] = hist(indSeq,unique(indSeq));
The output for your example is:
num = 2 1 1 3
val = 1 2 3 4
So num is the number of times each value in val occurs, e.g. the number 1 occurs 2 times in your example.
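The counts still have to be written back into A to answer the question as asked. A minimal sketch, assuming the A and indSeq from the question:

```matlab
A = zeros(1,10);
indSeq = [1,1,2,3,4,4,4];

% val holds the distinct indices, num how often each one occurs.
[num, val] = hist(indSeq, unique(indSeq));

% val has no repeats, so a vectorized increment works.
A(val) = A(val) + num;
```

One caveat: if indSeq contains only a single distinct value, unique(indSeq) is a scalar, and hist treats a scalar second argument as a number of bins rather than a bin center, so that case would need separate handling.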