这篇博客主要把SIFT通过MATLAB进行可视化,真正的SIFT算法准备在下一篇博客(如果我能找到并看懂的话)详细说明,也就是下面提到的siftWin32.exe,作者把SIFT给写成.exe了。
个别名词解释:
key:SIFT算法识别出的每个image中的关键点。
首先,执行[image, descriptors, locs] = sift(imageFile),求出每个key的长度为128的descriptor,长度为4的locs(row, column, scale, orientation)。
使用作者生成的siftWin32.exe来进行SIFT计算,生成tmp.key文件。本人理解生成的tmp.key文件是这样的:
sift.m
% [image, descriptors, locs] = sift(imageFile)
%
% This function reads an image and returns its SIFT keypoints.
% Input parameters:
% imageFile: the file name for the image.
%
% Returned:
% image: the image array in double format
% ******每个描述符是个长度为128的变量——128=8*4*4,8代表方向,4*4表示一共取了16个子图。******
% ******descriptors是一个k*128的矩阵,每一行都代表一个descriptor的信息。******
% descriptors: a K-by-128 matrix, where each row gives an invariant
% descriptor for one of the K keypoints. The descriptor is a vector
% of 128 values normalized to unit length.
% ******这应该就是记录描述符比例、位置和方向的的location信息的元素******
% ******locs是一个k*4的矩阵******
% locs: K-by-4 matrix, in which each row has the 4 values for a
% keypoint location (row, column, scale, orientation). The
% orientation is in the range [-PI, PI] radians.
%
% Credits: Thanks for initial version of this program to D. Alvaro and
% J.J. Guerrero, Universidad de Zaragoza (modified by D. Lowe)
function [image, descriptors, locs] = sift(imageFile)
% Load image
image = imread(imageFile);
% If you have the Image Processing Toolbox, you can uncomment the following
% lines to allow input of color images, which will be converted to grayscale.
% if isrgb(image)
% image = rgb2gray(image);
% end
[rows, cols] = size(image);
% Convert into PGM imagefile, readable by "keypoints" executable
f = fopen('tmp.pgm', 'w');
if f == -1
error('Could not create file tmp.pgm.');
end
fprintf(f, 'P5\n%d\n%d\n255\n', cols, rows);
fwrite(f, image', 'uint8');
fclose(f);
% Call keypoints executable
if isunix
command = '!./sift ';
else
% ************在windows下执行siftWin32.exe来产生tmp.key************
% ************这里才是关键,下一篇我要把作者的C的版本的研究下,这里先把大概过程演示出来************
command = '!siftWin32 ';
end
% ************这个tmp.key就是用来存SIFT的计算结果的,然后用MATLAB来可视化************
command = [command ' <tmp.pgm >tmp.key'];
eval(command);
% Open tmp.key and check its header
g = fopen('tmp.key', 'r');
if g == -1
error('Could not open file tmp.key.');
end
%************建立一个数组来存放头信息header************
[header, count] = fscanf(g, '%d %d', [1 2]);
if count ~= 2
error('Invalid keypoint file beginning.');
end
%************num用来存key个数,len则是表示descriptor 的长度************
num = header(1);
len = header(2);
if len ~= 128
error('Keypoint descriptor length invalid (should be 128).');
end
% Creates the two output matrices (use known size for efficiency)
locs = double(zeros(num, 4));
descriptors = double(zeros(num, 128));
% Parse tmp.key
for i = 1:num
[vector, count] = fscanf(g, '%f %f %f %f', [1 4]); %row col scale ori
if count ~= 4
error('Invalid keypoint file format');
end
locs(i, :) = vector(1, :);
[descrip, count] = fscanf(g, '%d', [1 len]);
if (count ~= 128)
error('Invalid keypoint file value.');
end
% Normalize each input vector to unit length
descrip = descrip / sqrt(sum(descrip.^2));
descriptors(i, :) = descrip(1, :);
end
fclose(g);
可以先利用sift生成的值来进行keys的显示:showkeys(img,locs),先有一个直观的感受。
showkeys
% showkeys(image, locs)
%
% This function displays an image with SIFT keypoints overlayed.
% Input parameters:
% image: the file name for the image (grayscale)
% locs: matrix in which each row gives a keypoint location (row,
% column, scale, orientation)
function showkeys(image, locs)
disp('Drawing SIFT keypoints ...');
% Draw image with keypoints
figure('Position', [50 50 size(image,2) size(image,1)]);
colormap('gray');
imagesc(image);
hold on;
imsize = size(image);
for i = 1: size(locs,1)
% Draw an arrow, each line transformed according to keypoint parameters.
TransformLine(imsize, locs(i,:), 0.0, 0.0, 1.0, 0.0);
TransformLine(imsize, locs(i,:), 0.85, 0.1, 1.0, 0.0);
TransformLine(imsize, locs(i,:), 0.85, -0.1, 1.0, 0.0);
end
hold off;
% ------ Subroutine: TransformLine -------
% Draw the given line in the image, but first translate, rotate, and
% scale according to the keypoint parameters.
%
% Parameters:
% Arrays:
% imsize = [rows columns] of image
% keypoint = [subpixel_row subpixel_column scale orientation]
%
% Scalars:
% x1, y1; begining of vector
% x2, y2; ending of vector
function TransformLine(imsize, keypoint, x1, y1, x2, y2)
% The scaling of the unit length arrow is set to approximately the radius
% of the region used to compute the keypoint descriptor.
len = 6 * keypoint(3);
% Rotate the keypoints by 'ori' = keypoint(4)
s = sin(keypoint(4));
c = cos(keypoint(4));
% Apply transform
r1 = keypoint(1) - len * (c * y1 + s * x1);
c1 = keypoint(2) + len * (- s * y1 + c * x1);
r2 = keypoint(1) - len * (c * y2 + s * x2);
c2 = keypoint(2) + len * (- s * y2 + c * x2);
line([c1 c2], [r1 r2], 'Color', 'c');
利用sift产生的信息(descriptors, locs)来进行keys的匹配。
match.m
% num = match(image1, image2)
%
% This function reads two images, finds their SIFT features, and
% displays lines connecting the matched keypoints. A match is accepted
% only if its distance is less than distRatio times the distance to the
% second closest match.
% It returns the number of matches displayed.
%
% Example: match('scene.pgm','book.pgm');
function num = match(image1, image2)
% Find SIFT keypoints for each image
[im1, des1, loc1] = sift(image1);
[im2, des2, loc2] = sift(image2);
% For efficiency in Matlab, it is cheaper to compute dot products between
% unit vectors rather than Euclidean distances. Note that the ratio of
% angles (acos of dot products of unit vectors) is a close approximation
% to the ratio of Euclidean distances for small angles.
%
% distRatio: Only keep matches in which the ratio of vector angles from the
% nearest to second nearest neighbor is less than distRatio.
distRatio = 0.6;
% For each descriptor in the first image, select its match to second image.
des2t = des2'; % Precompute matrix transpose
for i = 1 : size(des1,1)
dotprods = des1(i,:) * des2t; % Computes vector of dot products
[vals,indx] = sort(acos(dotprods)); % Take inverse cosine and sort results
% Check if nearest neighbor has angle less than distRatio times 2nd.
if (vals(1) < distRatio * vals(2))
match(i) = indx(1);
else
match(i) = 0;
end
end
% Create a new image showing the two images side by side.
im3 = appendimages(im1,im2);
% Show a figure with lines joining the accepted matches.
figure('Position', [100 100 size(im3,2) size(im3,1)]);
colormap('gray');
imagesc(im3);
hold on;
cols1 = size(im1,2);
for i = 1: size(des1,1)
if (match(i) > 0)
line([loc1(i,2) loc2(match(i),2)+cols1], ...
[loc1(i,1) loc2(match(i),1)], 'Color', 'c');
end
end
hold off;
num = sum(match > 0);
fprintf('Found %d matches.\n', num);
接下来准备研究一下SIFT真正的实现算法,把这个SIFT算法真正搞懂,没准能有更多收获。
最后,总结点心得以提示自己:
论文里面很多不懂的点,通过看源码,实现源码,清晰了。能实现最好还是实现,自己也能有兴趣继续做下去。