Repeat copies of array elements: run-length decoding in MATLAB

I'm trying to insert multiple values into an array using a 'values' array and a 'counter' array. For example, if:

a=[1,3,2,5]
b=[2,2,1,3]

I want the output of some function

c=somefunction(a,b)

to be

c=[1,1,3,3,2,5,5,5]

where a(1) recurs b(1) times, a(2) recurs b(2) times, and so on.

Is there a built-in function in MATLAB that does this? I'd like to avoid using a for loop if possible. I've tried variations of 'repmat()' and 'kron()' to no avail.

This is basically run-length decoding.

5 solutions

#1

Problem Statement

We have an array of values, vals, and an array of run lengths, runlens:

vals     = [1,3,2,5]
runlens  = [2,2,1,3]

We need to repeat each element of vals as many times as the corresponding element of runlens specifies. Thus, the final output would be:

output = [1,1,3,3,2,5,5,5]

Prospective Approach

cumsum is one of the fastest tools in MATLAB, and it is very useful for vectorizing problems that involve irregular patterns. In the stated problem, the irregularity comes from the differing elements of runlens.

To exploit cumsum, we need to do two things: initialize an array of zeros, and place "appropriate" values at "key" positions in that array, so that applying cumsum yields the final array in which each element of vals is repeated runlens times.

Steps: Let's number the steps above to make the approach easier to follow:

1) Initialize zeros array: What must its length be? Since each element is repeated runlens times, the length of the zeros array must be the sum of all runlens.

2) Find key positions/indices: The key positions are the places along the zeros array where each element of vals starts to repeat. Thus, for runlens = [2,2,1,3], the key positions mapped onto the zeros array would be:

[X 0 X 0 X X 0 0], where the X's mark the key positions.

3) Find appropriate values: The last step before using cumsum is to put the "appropriate" values into those key positions. Since cumsum is applied right afterwards, we need the differences between consecutive values, computed with diff, so that cumsum restores the original values. Because these differences are placed on a zeros array at positions separated by the runlens distances, cumsum leaves each vals element repeated runlens times in the final output.
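To make the three steps concrete, here is a small worked sketch using the sample inputs from the problem statement. The variable names (total_len, arr, key_pos, app_vals) are chosen for illustration only, and the comments show the intermediate value at each step.

```matlab
% Worked sketch of the three steps for the sample inputs.
vals    = [1,3,2,5];
runlens = [2,2,1,3];

% Step 1: the zeros array is as long as the sum of all run lengths.
total_len = sum(runlens);                     % 8
arr = zeros(1, total_len);

% Step 2: key positions are where each run starts.
key_pos = [1, cumsum(runlens(1:end-1)) + 1];  % [1 3 5 6]

% Step 3: place the differences of consecutive values at the key positions.
app_vals = diff([0, vals]);                   % [1 2 -1 3]
arr(key_pos) = app_vals;                      % [1 0 2 0 -1 3 0 0]

% cumsum restores each value and holds it until the next key position.
output = cumsum(arr);                         % [1 1 3 3 2 5 5 5]
```

Note that this sketch assumes every run length is at least 1; with a zero run length two key positions would coincide and one difference would be overwritten.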

Solution Code

Here's the implementation that stitches together all of the steps above:

%// Calculate cumsumed values of runlens.
%// We need these to initialize the zeros array and to find key positions later on.
clens = cumsum(runlens)

%// Initialize zeros array
array = zeros(1,(clens(end)))

%// Find key positions/indices
key_pos = [1 clens(1:end-1)+1]

%// Find appropriate values
app_vals = diff([0 vals])

%// Map app_vals at key_pos on array
array(key_pos) = app_vals

%// cumsum array for final output
output = cumsum(array)

Pre-allocation Hack

As can be seen, the code above pre-allocates with zeros. According to the Undocumented MATLAB blog post on faster pre-allocation, one can achieve much faster pre-allocation with:

`array(clens(end)) = 0` instead of `array = zeros(1,(clens(end)))`
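A quick sketch to check that the two forms yield the same array. One assumption the indexed form relies on is that the variable does not already exist in the workspace; otherwise the assignment only overwrites or extends the existing array instead of creating a fresh zeros array. The names n, arr_idx and arr_ref are illustrative.

```matlab
n = 8;                    % illustrative length, e.g. clens(end)

clear arr_idx             % make sure the variable does not exist yet
arr_idx(n) = 0;           % indexed assignment creates a 1-by-n array of zeros

arr_ref = zeros(1, n);    % conventional pre-allocation

same = isequal(arr_idx, arr_ref);   % true when both are 1-by-n zero rows
```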

Wrapping up: Function Code

To wrap everything up, here is a compact function that performs this run-length decoding:

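function out = rld_cumsum_diff(vals,runlens)
%// Run-length decoding via cumsum+diff; assumes row vectors and runlens >= 1
clens = cumsum(runlens);                      %// end position of each run
idx(clens(end)) = 0;                          %// pre-allocate zeros (indexed form)
idx([1 clens(1:end-1)+1]) = diff([0 vals]);   %// differences at run starts
out = cumsum(idx);                            %// rebuild the repeated values
return;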

Benchmarking

Benchmarking Code

Listed next is the benchmarking code that compares the runtimes of the cumsum+diff approach from this post against the cumsum-only approach, and the resulting speedups, on MATLAB R2014b:

datasizes = [reshape(linspace(10,70,4).'*10.^(0:4),1,[]) 10^6 2*10^6]; %//'
fcns = {'rld_cumsum','rld_cumsum_diff'}; %// approaches to be benchmarked

for k1 = 1:numel(datasizes)
    n = datasizes(k1); %// Create random inputs
    vals = randi(200,1,n);
    runs = [5000 randi(200,1,n-1)]; %// 5000 acts as an aberration
    for k2 = 1:numel(fcns) %// Time approaches  
        tsec(k2,k1) = timeit(@() feval(fcns{k2}, vals,runs), 1);
    end
end

figure,      %// Plot runtimes
loglog(datasizes,tsec(1,:),'-bo'), hold on
loglog(datasizes,tsec(2,:),'-k+')
set(gca,'xgrid','on'),set(gca,'ygrid','on'),
xlabel('Datasize ->'), ylabel('Runtimes (s)')
legend(upper(strrep(fcns,'_',' '))),title('Runtime Plot')

figure,      %// Plot speedups
semilogx(datasizes,tsec(1,:)./tsec(2,:),'-rx')        
set(gca,'ygrid','on'), xlabel('Datasize ->')
legend('Speedup(x) with cumsum+diff over cumsum-only'),title('Speedup Plot')

Associated function code for rld_cumsum.m:

function out = rld_cumsum(vals,runlens)
index = zeros(1,sum(runlens));
index([1 cumsum(runlens(1:end-1))+1]) = 1;
out = vals(cumsum(index));
return;

Runtime and Speedup Plots

[Figure: runtime and speedup plots; image not shown]

Conclusions

The proposed approach seems to give a noticeable speedup of about 3x over the cumsum-only approach.

Why is the cumsum+diff approach faster than the cumsum-only approach?

The reason lies in the final step of the cumsum-only approach, which has to map the "cumsumed" indices into vals. That indexing step touches sum(runlens) elements. The cumsum+diff approach instead applies diff to vals, for which MATLAB processes only n elements, where n is the number of run lengths. Since sum(runlens) is typically many times larger than n, the new approach is noticeably faster.
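As a rough illustration of that size difference, the sketch below counts how many elements each approach handles in the step that involves vals. It reuses the random-input recipe from the benchmarking code with a smaller, illustrative n; the names n_diff, n_map and ratio are assumptions made for this example.

```matlab
n = 1000;                            % illustrative number of runs
vals    = randi(200, 1, n);
runlens = randi(200, 1, n);

% cumsum+diff: diff returns one difference per run, i.e. n elements.
n_diff = numel(diff([0, vals]));

% cumsum-only: the final indexing vals(cumsum(index)) produces one
% element per output position, i.e. sum(runlens) elements.
index = zeros(1, sum(runlens));
index([1, cumsum(runlens(1:end-1)) + 1]) = 1;
n_map = numel(vals(cumsum(index)));

ratio = n_map / n_diff;              % equals the mean run length
```

The ratio is simply the average run length, so the gap between the two steps grows as the runs get longer.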

#2

Benchmarks

Updated for R2015b: repelem is now fastest for all data sizes.

Tested functions:

MATLAB's built-in repelem function, added in R2015a
gnovice's cumsum solution (rld_cumsum)
Divakar's cumsum+diff solution (rld_cumsum_diff)
knedlsepp's accumarray solution (knedlsepp5cumsumaccumarray) from this post
A naive loop-based implementation (naive_jit_test.m) to test the just-in-time compiler

Results of test_rld.m on R2015b:

Old timing plot using R2015a here.

Findings:

repelem is always the fastest, by roughly a factor of 2.
rld_cumsum_diff is consistently faster than rld_cumsum.
~~repelem is fastest for small data sizes (less than about 300-500 elements)~~
~~rld_cumsum_diff becomes significantly faster than repelem around 5 000 elements~~
~~repelem becomes slower than rld_cumsum somewhere between 30 000 and 300 000 elements~~
rld_cumsum has roughly the same performance as knedlsepp5cumsumaccumarray.
naive_jit_test.m has nearly constant speed; it is on par with rld_cumsum and knedlsepp5cumsumaccumarray for smaller sizes and a little faster for large sizes.

Old rate plot using R2015a here.

Conclusion

Use repelem ~~below about 5 000 elements and the cumsum+diff solution above~~.

#3

There's no built-in function I know of, but here's one solution:

index = zeros(1,sum(b));
index([1 cumsum(b(1:end-1))+1]) = 1;
c = a(cumsum(index));

Explanation:

A vector of zeros is first created with the same length as the output array (i.e. the sum of all the replications in b). Ones are then placed in the first element and in each subsequent element that marks where a new sequence of values starts in the output. The cumulative sum of the vector index can then be used to index into a, replicating each value the desired number of times.

For the sake of clarity, this is what the various vectors look like for the values of a and b given in the question:

        index = [1 0 1 0 1 1 0 0]
cumsum(index) = [1 1 2 2 3 4 4 4]
            c = [1 1 3 3 2 5 5 5]

EDIT: For the sake of completeness, there is another alternative using arrayfun, but it seems to take anywhere from 20 to 100 times longer to run than the above solution for vectors up to 10,000 elements long:

c = arrayfun(@(x,y) x.*ones(1,y),a,b,'UniformOutput',false);
c = [c{:}];
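A minimal sketch to confirm that the two variants agree on the inputs from the question. The variable names c_cumsum, pieces and c_arrayfun are illustrative and not part of the original answer.

```matlab
a = [1,3,2,5];
b = [2,2,1,3];

% cumsum-based indexing
index = zeros(1, sum(b));
index([1, cumsum(b(1:end-1)) + 1]) = 1;
c_cumsum = a(cumsum(index));

% arrayfun-based alternative: one cell per run, then concatenate
pieces = arrayfun(@(x,y) x.*ones(1,y), a, b, 'UniformOutput', false);
c_arrayfun = [pieces{:}];

agree = isequal(c_cumsum, c_arrayfun);   % both give [1 1 3 3 2 5 5 5]
```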

#4

As of R2015a there is finally a built-in, documented function for this: repelem. The following syntax, where the second argument is a vector, is the relevant one here:

W = repelem(V,N), with vector V and vector N, creates a vector W where element V(i) is repeated N(i) times.

Or put another way, "Each element of N specifies the number of times to repeat the corresponding element of V."

Example:

>> a=[1,3,2,5]
a =
     1     3     2     5
>> b=[2,2,1,3]
b =
     2     2     1     3
>> repelem(a,b)
ans =
     1     1     3     3     2     5     5     5
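For code that may also have to run on releases older than R2015a, one possible pattern is to call repelem when it is available and otherwise fall back to the cumsum-based indexing from the earlier answer. This is only a sketch: the availability check via which and the fallback choice are assumptions made for illustration, not something the answers above prescribe.

```matlab
a = [1,3,2,5];
b = [2,2,1,3];

if ~isempty(which('repelem'))
    % repelem is available (R2015a or later)
    c = repelem(a, b);
else
    % Fallback: cumsum-based indexing (assumes every b(i) >= 1)
    index = zeros(1, sum(b));
    index([1, cumsum(b(1:end-1)) + 1]) = 1;
    c = a(cumsum(index));
end
```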

#5

The performance problems in MATLAB's built-in repelem have been fixed as of R2015b. I ran the test_rld.m program from chappjc's post in R2015b, and repelem is now faster than the other algorithms by about a factor of 2:

[Figure: timing results of test_rld.m on R2015b; image not shown]
