Optimizing a histogram intersection kernel in MATLAB

I want to try an SVM classifier with a histogram intersection kernel on a dataset of 153 images, but it takes a long time to run. This is my code:

a = load('...'); %vectors
b = load('...'); %labels
g = dataset(a,b);

error = crossval(g,libsvc([],proxm([],'ih'),100),10,10);
error1 = crossval(g,libsvc([],proxm([],'ih'),10),10,10);
error2 = crossval(g,libsvc([],proxm([],'ih'),1),10,10);

My implementation of the kernel within the proxm function is:

...
case {'dist_histint','ih'}
    [m,d]=size(A);
    [n,d1]=size(B);
    if (d ~= d1)
        error('column length of A (%d) != column length of B (%d)\n',d,d1);
    end

    % With the MATLAB JIT compiler the trivial implementation turns out
    % to be the fastest, especially for large matrices.
    D = zeros(m,n);
    for i=1:m % m is number of samples of A 
        if (0==mod(i,1000)) fprintf('.'); end
        for j=1:n % n is number of samples of B
            D(i,j) = sum(min([A(i,:);B(j,:)]));%./max(A(:,i),B(:,j)));
        end            
    end

How can I optimize this code in MATLAB?

1 Answer

#1

You can replace the nested loops that compute D with this bsxfun-based vectorized approach:

D = squeeze(sum(bsxfun(@min,A,permute(B,[3 2 1])),2));

Or avoid squeeze with this variant, which also keeps D as 1-by-n when A has a single row (squeeze would return n-by-1 in that case):

D = sum(bsxfun(@min,permute(A,[1 3 2]),permute(B,[3 1 2])),3);

If the calculations of D involve max instead of min, just replace @min with @max there.

Explanation: bsxfun expands its inputs along their singleton dimensions (dimensions of size 1) until the sizes match, then applies the function handle passed as its first argument (here @min) element-wise. This implicit expansion is what lets a vectorized expression replace the for-loops.

In many cases the arrays do not already have singleton dimensions in the right places, so to vectorize with bsxfun we have to create them. One tool for that is permute, which rearranges the dimensions of an array. That is essentially how the vectorized approach above works.
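
To make the dimension bookkeeping concrete, here is a minimal sketch on made-up data (the values of A and B are illustrative assumptions, not taken from the question). It shows the array sizes at each step of the squeeze-based version:

```matlab
% Illustrative data only: 2 histograms in A, 3 in B, 4 bins each.
A = [1 0 2 3; 0 1 1 0];            % m-by-d = 2-by-4
B = [2 1 0 1; 1 1 1 1; 0 0 3 2];   % n-by-d = 3-by-4

Bp = permute(B, [3 2 1]);   % 1-by-4-by-3: the rows of B now run along dim 3
M  = bsxfun(@min, A, Bp);   % 2-by-4-by-3: A expanded along dim 3, Bp along dim 1
D  = squeeze(sum(M, 2));    % 2-by-3: sum over the bins, then drop the singleton dim

disp(size(Bp));   % 1 4 3
disp(size(M));    % 2 4 3
disp(D);          % [2 3 4; 1 2 1]
```

Entry D(i,j) is the sum over bins of the element-wise minimum of row i of A and row j of B, which is what the double loop computes one pair at a time.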

Thus, your kernel code:

...
case {'dist_histint','ih'}
    [m,d]=size(A);
    [n,d1]=size(B);
    if (d ~= d1)
        error('column length of A (%d) != column length of B (%d)\n',d,d1);
    end

    % With the MATLAB JIT compiler the trivial implementation turns out
    % to be the fastest, especially for large matrices.
    D = zeros(m,n);
    for i=1:m % m is number of samples of A 
        if (0==mod(i,1000)) fprintf('.'); end
        for j=1:n % n is number of samples of B
            D(i,j) = sum(min([A(i,:);B(j,:)]));%./max(A(:,i),B(:,j)));
        end            
    end

reduces to:

...
case {'dist_histint','ih'}
    [m,d]=size(A);
    [n,d1]=size(B);
    if (d ~= d1)
        error('column length of A (%d) != column length of B (%d)\n',d,d1);
    end
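    % D(i,j) = sum over the d bins of min(A(i,:),B(j,:)); D is m-by-n.
    % Note: with m == 1, squeeze returns n-by-1; the alternative below does not.
    D = squeeze(sum(bsxfun(@min,A,permute(B,[3 2 1])),2));
    % OR D = sum(bsxfun(@min,permute(A,[1 3 2]),permute(B,[3 1 2])),3);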

I am assuming that the line if (0==mod(i,1000)) fprintf('.'); end does not affect the calculation, since it only prints a progress dot, so it is left out of the vectorized version.
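
As a sanity check, the sketch below compares the original double loop with the vectorized expression on random data. The sizes, the random inputs, and the variable names D_loop, D_vec and D_bins are assumptions made for illustration. It also includes a variant that loops only over the d bins: the fully vectorized form builds an m-by-n-by-d intermediate array, which may become a memory concern for much larger datasets, whereas the per-bin loop only ever holds m-by-n arrays.

```matlab
% Hypothetical random test data; the sizes are arbitrary.
rng(0);
m = 5; n = 7; d = 16;
A = rand(m, d);
B = rand(n, d);

% Reference: the original double loop over all pairs of rows.
D_loop = zeros(m, n);
for i = 1:m
    for j = 1:n
        D_loop(i, j) = sum(min(A(i, :), B(j, :)));
    end
end

% Fully vectorized version (creates an m-by-n-by-d intermediate array).
D_vec = sum(bsxfun(@min, permute(A, [1 3 2]), permute(B, [3 1 2])), 3);

% Memory-lean variant: a single loop over the d bins.
D_bins = zeros(m, n);
for k = 1:d
    % m-by-1 column against 1-by-n row expands to an m-by-n matrix
    D_bins = D_bins + bsxfun(@min, A(:, k), B(:, k).');
end

% Differences should be at the level of floating-point rounding.
fprintf('max |loop - vectorized| = %g\n', max(abs(D_loop(:) - D_vec(:))));
fprintf('max |loop - per-bin|    = %g\n', max(abs(D_loop(:) - D_bins(:))));
```

Which variant is fastest depends on m, n, d and the MATLAB release, so it is worth timing them (for example with tic and toc) on the real data.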
