Source Code Analysis of the Eigenfaces Face Recognition Algorithm in OpenCV

Date: 2024-01-19 09:20:02

1 Theoretical Background

Understanding the Eigenfaces face recognition algorithm requires a few pieces of theory, which are summarized below.

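1.1 Covariance Matrix

We start from the formulas for the mean, the standard deviation and the variance of n samples:

$$\bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_i,\qquad s=\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^2},\qquad s^2=\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^2$$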

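From these formulas, the mean describes the average of the sample set, while the standard deviation describes the average distance from the sample points to the mean. Take national income as an example: the mean reflects the average income, while the standard deviation (or variance) reflects the gap between rich and poor. If two countries have the same mean income, the one with the larger standard deviation has the more unequal income distribution. These formulas all describe one-dimensional data. Extending the variance formula to two variables gives the covariance:

$$\operatorname{cov}(X,Y)=\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)$$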

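Covariance expresses how two random variables vary together: a positive value means they are positively correlated, a negative value means they are negatively correlated, and zero means they are uncorrelated. As a small example, suppose a person is described by four dimensions: height, weight, distance from the ceiling, and time spent painting each day. Let the height samples be X=[1 2 3 4 5 6 7 8 9], the weight samples Y=[11 12 13 14 15 16 17 18 19], the distance-from-ceiling samples Z=[9 8 7 6 5 4 3 2 1], and the daily painting time L=[1 1 1 1 1 1 1 1 1]. Then cov(X,Y)=7.5, cov(X,Z)=-7.5 and cov(X,L)=0: X and Y are positively correlated, X and Z are negatively correlated, and X and L are uncorrelated.

Each random variable in this example represents one dimension, and so far we have computed the covariance for only some pairs of dimensions. Computing it for all pairs and arranging the results as a matrix gives the covariance matrix: C(nxn)=[c(i,j)]=[cov(Dim_i,Dim_j)], i,j=1,2,…,n, where Dim_i is the vector of samples for the i-th dimension. Taking X, Y, Z, L as dimensions 1, 2, 3, 4, we get c(1,1)=7.5, c(1,2)=7.5, c(1,3)=-7.5, c(1,4)=0, and so on, so the covariance matrix is:

$$C=\begin{pmatrix}7.5&7.5&-7.5&0\\7.5&7.5&-7.5&0\\-7.5&-7.5&7.5&0\\0&0&0&0\end{pmatrix}$$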

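In Matlab, each row of a matrix can be treated as one sample of the 4 random variables and each column as one dimension, so the covariance matrix of the 4 dimensions can be computed directly with the cov function.

[Figure: Matlab session applying cov to the 9x4 sample matrix]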
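The worked example above can be checked with a few lines of plain C++. This is a minimal sketch, not OpenCV code: the function names sampleCovariance and covarianceMatrix are made up for illustration, and the n-1 denominator is assumed because it is the one that reproduces cov(X,Y)=7.5.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sample covariance of two equally long series, using the n-1 denominator.
double sampleCovariance(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() > 1);
    const std::size_t n = x.size();
    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= static_cast<double>(n);
    meanY /= static_cast<double>(n);
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += (x[i] - meanX) * (y[i] - meanY);
    }
    return acc / static_cast<double>(n - 1);
}

// Covariance matrix of several dimensions; dims[i] holds all samples of dimension i.
std::vector<std::vector<double>> covarianceMatrix(const std::vector<std::vector<double>>& dims) {
    const std::size_t n = dims.size();
    std::vector<std::vector<double>> c(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            c[i][j] = sampleCovariance(dims[i], dims[j]);
        }
    }
    return c;
}
```

With the X, Y, Z, L series from the example, sampleCovariance(X, Y) evaluates to 7.5, sampleCovariance(X, Z) to -7.5 and sampleCovariance(X, L) to 0.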

1.2 Jacobi Iteration for the Eigenvalues and Eigenvectors of a Symmetric Matrix

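The basic idea of the Jacobi iteration is to reduce a symmetric matrix A to a diagonal matrix through a sequence of plane rotations (orthogonal similarity transformations), and to read the eigenvalues and eigenvectors of A off the result. Linear algebra tells us that if A is a real symmetric matrix, there exists an orthogonal matrix U such that U^T*A*U=D, where D is a diagonal matrix whose diagonal elements λi are the eigenvalues of A, and the i-th column of U is an eigenvector of A for the eigenvalue λi. Finding the eigenvalues of a symmetric matrix A therefore reduces to finding an orthogonal matrix U that makes U^T*A*U diagonal. The difficulty lies in constructing U, so we first look at a rotation in the plane. For a 2x2 real symmetric matrix A and a rotation P by the angle θ:

$$A=\begin{pmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{pmatrix},\quad a_{12}=a_{21},\qquad P=\begin{pmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{pmatrix}$$

Then:

$$P^{T}AP=\begin{pmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{pmatrix}$$

where:

$$\begin{aligned}b_{11}&=a_{11}\cos^2\theta+a_{22}\sin^2\theta+a_{12}\sin 2\theta\\b_{22}&=a_{11}\sin^2\theta+a_{22}\cos^2\theta-a_{12}\sin 2\theta\\b_{12}=b_{21}&=a_{12}\cos 2\theta+\tfrac{1}{2}\left(a_{22}-a_{11}\right)\sin 2\theta\end{aligned}$$

Choosing θ so that the off-diagonal element vanishes makes P^T*A*P diagonal:

$$b_{12}=0\iff\tan 2\theta=\frac{2a_{12}}{a_{11}-a_{22}}\qquad\left(\theta=\tfrac{\pi}{4}\ \text{when}\ a_{11}=a_{22}\right)$$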

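The derivation above describes a way to construct an orthogonal matrix P such that P^T*A*P is diagonal. The method extends to n x n symmetric matrices. First we introduce the n-th order rotation matrix (Givens matrix) U_pq, which equals the identity matrix except for four elements in rows and columns p and q:

$$u_{pp}=\cos\theta,\quad u_{pq}=-\sin\theta,\quad u_{qp}=\sin\theta,\quad u_{qq}=\cos\theta,\qquad u_{ii}=1\ (i\neq p,q),\quad u_{ij}=0\ \text{otherwise}$$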

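The plane rotation matrix has the following properties:

(1) U_pq is an orthogonal matrix, that is, U_pq^T*U_pq=E

(2) B=U_pq^T*A*U_pq is still a symmetric matrix, and B has the same eigenvalues as A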

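Each iteration of the Jacobi method applies one transformation of the form in (2), where p and q are the row and column indices of the off-diagonal element with the largest absolute value in the matrix A left by the previous iteration. The elements after the transformation are given by:

$$\begin{aligned}b_{pp}&=a_{pp}\cos^2\theta+a_{qq}\sin^2\theta+a_{pq}\sin 2\theta\\b_{qq}&=a_{pp}\sin^2\theta+a_{qq}\cos^2\theta-a_{pq}\sin 2\theta\\b_{pq}=b_{qp}&=a_{pq}\cos 2\theta+\tfrac{1}{2}\left(a_{qq}-a_{pp}\right)\sin 2\theta\\b_{pj}=b_{jp}&=a_{pj}\cos\theta+a_{qj}\sin\theta\qquad(j\neq p,q)\\b_{qj}=b_{jq}&=-a_{pj}\sin\theta+a_{qj}\cos\theta\qquad(j\neq p,q)\\b_{ij}&=a_{ij}\qquad(i,j\neq p,q)\end{aligned}$$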

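The formulas show that the transformed matrix differs from the original only in rows and columns p and q. The rotation angle is computed as in the 2D case, tan 2θ = 2*a_pq/(a_pp-a_qq), and its purpose is to make a_pq and a_qp zero. Every iteration therefore zeroes the off-diagonal element with the largest absolute value. Elements zeroed earlier can become nonzero again, but the sum of squares of the off-diagonal elements decreases with every rotation, so the iteration as a whole drives the off-diagonal elements toward zero, at which point the diagonal elements are the eigenvalues λi of the original symmetric matrix.

Let U_i be the rotation matrix of the i-th iteration. Because each step computes U_i^T*A*U_i, the rotations are accumulated by starting from the identity matrix I and right-multiplying by each U_i in turn, which gives D=I*U_1*U_2…U_k after k iterations (this accumulated D is not the diagonal matrix D introduced at the start of this section; the diagonal result is written Λ here). The column vectors of D are the eigenvectors for the eigenvalues λi. Proof:

$$\Lambda=U_k^{T}\cdots U_2^{T}U_1^{T}\,A\,U_1U_2\cdots U_k=D^{T}AD\ \Rightarrow\ AD=D\Lambda\ \Rightarrow\ A\,d_i=\lambda_i\,d_i,\qquad\Lambda=\operatorname{diag}(\lambda_1,\dots,\lambda_n)$$

Here d_i is the i-th column of D, and the second step uses the orthogonality of D. By the last equation and the definition of an eigenvalue, λi is an eigenvalue of A and d_i is a corresponding eigenvector.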
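The iteration described in this section can be sketched in plain C++ as follows. This is an illustrative implementation of the classical Jacobi method under the conventions used above (rotation U with u_pq = -sin θ, update A ← U^T*A*U, accumulation V ← V*U); it is not OpenCV's implementation, and the names EigenResult and jacobiEigen as well as the default tolerance and rotation limit are assumptions of the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

struct EigenResult {
    std::vector<double> values;  // values[i] is the eigenvalue for column i of vectors
    Matrix vectors;              // eigenvectors stored as columns
};

// Classical Jacobi iteration for a real symmetric matrix (taken by value, modified locally).
EigenResult jacobiEigen(Matrix a, int maxRotations = 1000, double eps = 1e-12) {
    const std::size_t n = a.size();
    for (const auto& row : a) assert(row.size() == n);  // the matrix must be square

    Matrix v(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) v[i][i] = 1.0;  // start from the identity

    for (int iter = 0; iter < maxRotations; ++iter) {
        // Locate the off-diagonal element with the largest absolute value.
        std::size_t p = 0, q = 0;
        double maxAbs = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = i + 1; j < n; ++j) {
                if (std::fabs(a[i][j]) > maxAbs) {
                    maxAbs = std::fabs(a[i][j]);
                    p = i;
                    q = j;
                }
            }
        }
        if (maxAbs < eps) break;  // numerically diagonal: done

        // Rotation angle from tan(2*theta) = 2*a_pq / (a_pp - a_qq).
        const double theta = 0.5 * std::atan2(2.0 * a[p][q], a[p][p] - a[q][q]);
        const double c = std::cos(theta), s = std::sin(theta);

        for (std::size_t k = 0; k < n; ++k) {  // A <- A * U: columns p and q change
            const double akp = a[k][p], akq = a[k][q];
            a[k][p] = c * akp + s * akq;
            a[k][q] = -s * akp + c * akq;
        }
        for (std::size_t k = 0; k < n; ++k) {  // A <- U^T * A: rows p and q change
            const double apk = a[p][k], aqk = a[q][k];
            a[p][k] = c * apk + s * aqk;
            a[q][k] = -s * apk + c * aqk;
        }
        for (std::size_t k = 0; k < n; ++k) {  // V <- V * U accumulates the rotations
            const double vkp = v[k][p], vkq = v[k][q];
            v[k][p] = c * vkp + s * vkq;
            v[k][q] = -s * vkp + c * vkq;
        }
    }

    EigenResult result;
    result.vectors = v;
    for (std::size_t i = 0; i < n; ++i) result.values.push_back(a[i][i]);
    return result;
}
```

For the matrix [[2, 1], [1, 2]] a single rotation by π/4 is enough, and the sketch returns the eigenvalues 3 and 1 (up to rounding). Unlike a production routine, it does not sort the eigenvalues.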

2 OpenCV Source Code Walkthrough

2.1 Key Functions

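(1) void reduce(InputArray src, OutputArray dst, int dim, int rtype, int dtype=-1)

Its English comment: transforms 2D matrix to 1D row or column vector by taking sum, minimum, maximum or mean value over all the rows.

The comment is not quite accurate. What the function actually does is reduce a 2D matrix to a 1D row vector or column vector. When reducing to a row vector, each element is the sum, minimum, maximum or mean of all values in the corresponding column of the source matrix; when reducing to a column vector, each element is the sum, minimum, maximum or mean of all values in the corresponding row. The parameters are:

src: source matrix

dst: destination vector

dim: whether the result is a row vector or a column vector; 0 reduces the source matrix to a row vector, any other value reduces it to a column vector

rtype: one of CV_REDUCE_SUM, CV_REDUCE_MAX, CV_REDUCE_MIN, CV_REDUCE_AVG

dtype: type of the destination vector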
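To make the dim and rtype semantics concrete, here is a plain C++ sketch of the same reduction on a vector-of-vectors matrix. It does not call OpenCV: reduceSketch and the ReduceOp enum are hypothetical stand-ins for cv::reduce and the CV_REDUCE_* constants.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-ins for CV_REDUCE_SUM, CV_REDUCE_AVG, CV_REDUCE_MAX, CV_REDUCE_MIN.
enum class ReduceOp { Sum, Avg, Max, Min };

// dim == 0: one result per column (row vector); otherwise one result per row (column vector).
std::vector<double> reduceSketch(const Matrix& src, int dim, ReduceOp op) {
    assert(!src.empty() && !src[0].empty());
    const std::size_t rows = src.size(), cols = src[0].size();
    const std::size_t outLen = (dim == 0) ? cols : rows;
    const std::size_t inLen = (dim == 0) ? rows : cols;
    std::vector<double> dst(outLen);
    for (std::size_t o = 0; o < outLen; ++o) {
        double acc = (dim == 0) ? src[0][o] : src[o][0];
        for (std::size_t k = 1; k < inLen; ++k) {
            const double v = (dim == 0) ? src[k][o] : src[o][k];
            switch (op) {
                case ReduceOp::Sum:
                case ReduceOp::Avg: acc += v; break;
                case ReduceOp::Max: acc = std::max(acc, v); break;
                case ReduceOp::Min: acc = std::min(acc, v); break;
            }
        }
        if (op == ReduceOp::Avg) acc /= static_cast<double>(inLen);
        dst[o] = acc;
    }
    return dst;
}
```

Reducing a samples-by-pixels matrix with dim 0 and the averaging operation yields the per-pixel mean, which is the kind of mean vector used in the Eigenfaces procedure in section 2.2.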

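(2) void gemm(InputArray src1, InputArray src2, double alpha, InputArray src3, double gamma, OutputArray dst, int flags=0)

Its English comment: implements generalized matrix product algorithm GEMM from BLAS.

What the function does: it implements generalized matrix multiplication. Only the last parameter needs explanation:

flags: one of GEMM_1_T, GEMM_2_T, GEMM_3_T, or a combination of them. For example, GEMM_1_T transposes src1 before the multiplication. The function as a whole is described by the following formula:

dst=alpha*op(src1)*op(src2)+gamma*op(src3), where flags determines whether op(X) is X or X^T.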
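The formula can be spelled out in a few lines of plain C++. The sketch below is illustrative only and does not call OpenCV: gemmSketch is a made-up name, the three bool parameters are hypothetical stand-ins for the GEMM_1_T, GEMM_2_T and GEMM_3_T flags, and an empty src3 is treated as a zero matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns the transpose of a rectangular matrix.
Matrix transposed(const Matrix& m) {
    if (m.empty()) return m;
    Matrix t(m[0].size(), std::vector<double>(m.size(), 0.0));
    for (std::size_t i = 0; i < m.size(); ++i)
        for (std::size_t j = 0; j < m[0].size(); ++j)
            t[j][i] = m[i][j];
    return t;
}

// dst = alpha*op(src1)*op(src2) + gamma*op(src3); an empty src3 counts as zero.
// t1, t2, t3 are hypothetical stand-ins for GEMM_1_T, GEMM_2_T, GEMM_3_T.
Matrix gemmSketch(Matrix src1, Matrix src2, double alpha, Matrix src3, double gamma,
                  bool t1 = false, bool t2 = false, bool t3 = false) {
    if (t1) src1 = transposed(src1);
    if (t2) src2 = transposed(src2);
    if (t3) src3 = transposed(src3);
    assert(!src1.empty() && !src2.empty() && src1[0].size() == src2.size());
    const std::size_t rows = src1.size(), inner = src2.size(), cols = src2[0].size();
    assert(src3.empty() || (src3.size() == rows && src3[0].size() == cols));
    Matrix dst(rows, std::vector<double>(cols, 0.0));
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < inner; ++k) sum += src1[i][k] * src2[k][j];
            dst[i][j] = alpha * sum + (src3.empty() ? 0.0 : gamma * src3[i][j]);
        }
    }
    return dst;
}
```

Calling gemmSketch(A, B, 1.0, Matrix(), 0.0) is an ordinary matrix product, and passing true for t1 multiplies A^T by B instead.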

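(3) void mulTransposed( InputArray src, OutputArray dst, bool aTa, InputArray delta=noArray(), double scale=1, int dtype=-1 )

Its English comment: multiplies matrix by its transposition from the left or from the right.

What the function does: it multiplies a matrix by its own transpose, from the left or from the right. The parameters are:

src: source matrix

dst: destination matrix

aTa: multiplication order; true computes A^T*A, false computes A*A^T

delta: array subtracted from src before the multiplication

scale: factor by which the product is scaled after the multiplication

dtype: type of the destination matrix

When aTa is true, the function is described by the formula dst=(src-delta)^T*(src-delta)*scale. Internally the function calls function (2).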
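The following plain C++ sketch spells out both multiplication orders of that formula. It is not OpenCV code: mulTransposedSketch is a made-up name, and the sketch assumes that delta is either empty or has exactly the same size as src.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// aTa == true : dst = scale * (src - delta)^T * (src - delta)
// aTa == false: dst = scale * (src - delta) * (src - delta)^T
// Assumption of this sketch: delta is empty or has the same size as src.
Matrix mulTransposedSketch(Matrix src, bool aTa, const Matrix& delta = Matrix(),
                           double scale = 1.0) {
    assert(!src.empty() && !src[0].empty());
    const std::size_t rows = src.size(), cols = src[0].size();
    if (!delta.empty()) {
        assert(delta.size() == rows && delta[0].size() == cols);
        for (std::size_t i = 0; i < rows; ++i)
            for (std::size_t j = 0; j < cols; ++j)
                src[i][j] -= delta[i][j];
    }
    const std::size_t n = aTa ? cols : rows;      // side length of the square result
    const std::size_t inner = aTa ? rows : cols;  // length of each dot product
    Matrix dst(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < inner; ++k) {
                sum += aTa ? src[k][i] * src[k][j] : src[i][k] * src[j][k];
            }
            dst[i][j] = scale * sum;
        }
    }
    return dst;
}
```

With one sample per row, delta set to the repeated mean row and aTa false, the result is the small samples-by-samples matrix B*B^T that section 2.2 works with.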

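(4) void calcCovarMatrix( InputArray samples, OutputArray covar, OutputArray mean, int flags, int ctype=CV_64F)

Its English comment: computes covariation matrix of a set of samples

What the function does: it computes the covariance matrix of the row vectors or column vectors of a matrix. The function calls function (3) to do the work.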

(5) bool eigen(InputArray src, OutputArray eigenvalues, OutputArray eigenvectors, int lowindex=-1, int highindex=-1)

Its English comment: finds eigenvalues and eigenvectors of a symmetric matrix

What the function does: it computes the eigenvalues and eigenvectors of a symmetric matrix, using the Jacobi method internally.

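2.2 Main Procedure

The idea behind EigenFace is to transform faces from pixel space into another space and to measure similarity there. The transformation EigenFace chooses is PCA, the well-known principal component analysis. EigenFace uses PCA to obtain the principal components of the face distribution: concretely, it computes the eigenvalues of the covariance matrix of all face images in the training set, and the eigenvectors belonging to those eigenvalues are the so-called "eigenfaces". Each eigenvector describes one kind of variation or feature of faces, so every face can be expressed as a linear combination of the eigenfaces. The procedure below uses the AT&T face database (40 people with 10 facial-expression images each, 400 face images in total, each with a resolution of 92x112). 399 of the faces serve as the sample set and the last one is the face to be recognized: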

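(1) Flatten the image data of every face in the training set into one row and stack the rows into one large matrix A. A has size 399x10304, that is, 399 rows and 10304 columns.

(2) Add up the 399 faces dimension by dimension and divide by the number of faces to obtain the mean vector Mean (1x10304). Subtract the mean vector from every row of A to obtain the difference matrix B.

(3) Compute the matrix C=B*B^T, which has size 399x399, and then compute the eigenvalues λi and eigenvectors ei of C, 0<=i<399.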

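(4) The matrix in the previous step is not actually the covariance matrix of the face sample set. Each face sample has 10304 dimensions, and a covariance matrix reflects the correlation between dimensions, so the true covariance matrix of the sample set is C'=B^T*B (10304x10304, up to a scale factor). Note that C' is not the transpose of C: C is symmetric, so C^T=C. If ei is an eigenvector of C with eigenvalue λi, then λi is also an eigenvalue of C', with eigenvector vi=B^T*ei (vi is a column vector with 10304 rows). The proof is:

C*ei=λi*ei  =>  B*B^T*ei=λi*ei  =>  B^T*B*B^T*ei=λi*B^T*ei  =>  C'*vi=λi*vi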

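The eigenvectors vi are the "eigenfaces". Together they form the eigenvector matrix V (10304x399). For any face vector α, multiplying it by V gives the projections of α onto the eigenvectors: each element of α*V is the projection of α onto the corresponding eigenface. To recognize a face, first compute its projection vector onto the eigenfaces, then compare it with the projection vector of every sample face; the sample at the smallest distance is the best match.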
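The identity proved in step (4) can be checked numerically without ever forming the large matrix B^T*B. The plain C++ sketch below is only an illustration of that identity, not OpenCV's code; the function names are made up, B is assumed to be rectangular with one sample per row, and the eigenvalue is assumed to be nonzero so that B^T*e can be normalized.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

// y = B * x, where B has one sample per row.
Vector multiply(const Matrix& b, const Vector& x) {
    Vector y(b.size(), 0.0);
    for (std::size_t i = 0; i < b.size(); ++i) {
        assert(b[i].size() == x.size());
        for (std::size_t j = 0; j < x.size(); ++j) y[i] += b[i][j] * x[j];
    }
    return y;
}

// y = B^T * x.
Vector multiplyTransposed(const Matrix& b, const Vector& x) {
    assert(!b.empty() && b.size() == x.size());
    Vector y(b[0].size(), 0.0);
    for (std::size_t i = 0; i < b.size(); ++i)
        for (std::size_t j = 0; j < b[i].size(); ++j) y[j] += b[i][j] * x[i];
    return y;
}

// Lift an eigenvector e of the small matrix B*B^T to a unit eigenvector of B^T*B.
Vector liftEigenvector(const Matrix& b, const Vector& e) {
    Vector v = multiplyTransposed(b, e);
    double norm = 0.0;
    for (double x : v) norm += x * x;
    norm = std::sqrt(norm);
    assert(norm > 0.0);  // fails only for a zero eigenvalue
    for (double& x : v) x /= norm;
    return v;
}

// (B^T*B) * v evaluated as B^T * (B * v), so the large matrix is never built.
Vector applyLargeCovariance(const Matrix& b, const Vector& v) {
    return multiplyTransposed(b, multiply(b, v));
}
```

For B = [[1, 0, 1], [0, 1, 1]] the small matrix B*B^T is [[2, 1], [1, 2]] with eigenvector (1, 1) for the eigenvalue 3; lifting it gives a vector proportional to (1, 1, 2), and applying B^T*B to that vector scales it by 3.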

2.3 Core Source Code

The code is taken from OpenCV 2.4.9.

    void Eigenfaces::train(InputArrayOfArrays _src, InputArray _local_labels) {
        if(_src.total() == 0) {
            string error_message = format("Empty training data was given. You'll need more than one sample to learn a model.");
            CV_Error(CV_StsBadArg, error_message);
        } else if(_local_labels.getMat().type() != CV_32SC1) {
            string error_message = format("Labels must be given as integer (CV_32SC1). Expected %d, but was %d.", CV_32SC1, _local_labels.type());
            CV_Error(CV_StsBadArg, error_message);
        }
        // make sure data has correct size
        if(_src.total() > 1) {
            for(int i = 1; i < static_cast<int>(_src.total()); i++) {
                if(_src.getMat(i-1).total() != _src.getMat(i).total()) {
                    string error_message = format("In the Eigenfaces method all input samples (training images) must be of equal size! Expected %d pixels, but was %d pixels.", _src.getMat(i-1).total(), _src.getMat(i).total());
                    CV_Error(CV_StsUnsupportedFormat, error_message);
                }
            }
        }
        // get labels
        Mat labels = _local_labels.getMat();
        // observations in row
        Mat data = asRowMatrix(_src, CV_64FC1);

        // number of samples
        int n = data.rows;
        // assert there are as much samples as labels
        if(static_cast<int>(labels.total()) != n) {
            string error_message = format("The number of samples (src) must equal the number of labels (labels)! len(src)=%d, len(labels)=%d.", n, labels.total());
            CV_Error(CV_StsBadArg, error_message);
        }
        // clear existing model data
        _labels.release();
        _projections.clear();
        // clip number of components to be valid
        if((_num_components <= 0) || (_num_components > n))
            _num_components = n;

        // perform the PCA
        PCA pca(data, Mat(), CV_PCA_DATA_AS_ROW, _num_components);
        // copy the PCA results
        _mean = pca.mean.reshape(1,1); // store the mean vector
        _eigenvalues = pca.eigenvalues.clone(); // eigenvalues by row
        transpose(pca.eigenvectors, _eigenvectors); // eigenvectors by column
        // store labels for prediction
        _labels = labels.clone();
        // save projections
        for(int sampleIdx = 0; sampleIdx < data.rows; sampleIdx++) {
            Mat p = subspaceProject(_eigenvectors, _mean, data.row(sampleIdx));
            _projections.push_back(p);
        }
    }

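Training on the face samples

The PCA class constructed by the statement PCA pca(...) implements the core functionality: computing the covariance matrix of the sample matrix, computing the eigenvectors of that covariance matrix, and so on. In the final loop, _mean is the mean face vector, and the subspaceProject call computes the projection of each face vector, minus the mean vector, onto the set of eigenfaces.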

    PCA& PCA::operator()(InputArray _data, InputArray __mean, int flags, int maxComponents)
    {
        Mat data = _data.getMat(), _mean = __mean.getMat();
        int covar_flags = CV_COVAR_SCALE;
        int i, len, in_count;
        Size mean_sz;

        CV_Assert( data.channels() == 1 );
        if( flags & CV_PCA_DATA_AS_COL )
        {
            len = data.rows;
            in_count = data.cols;
            covar_flags |= CV_COVAR_COLS;
            mean_sz = Size(1, len);
        }
        else
        {
            len = data.cols;
            in_count = data.rows;
            covar_flags |= CV_COVAR_ROWS;
            mean_sz = Size(len, 1);
        }

        int count = std::min(len, in_count), out_count = count;
        if( maxComponents > 0 )
            out_count = std::min(count, maxComponents);

        // "scrambled" way to compute PCA (when cols(A)>rows(A)):
        // B = A'A; B*x=b*x; C = AA'; C*y=c*y -> AA'*y=c*y -> A'A*(A'*y)=c*(A'*y) -> c = b, x=A'*y
        if( len <= in_count )
            covar_flags |= CV_COVAR_NORMAL;

        int ctype = std::max(CV_32F, data.depth());
        mean.create( mean_sz, ctype );

        Mat covar( count, count, ctype );

        if( _mean.data )
        {
            CV_Assert( _mean.size() == mean_sz );
            _mean.convertTo(mean, ctype);
            covar_flags |= CV_COVAR_USE_AVG;
        }

        calcCovarMatrix( data, covar, mean, covar_flags, ctype );
        eigen( covar, eigenvalues, eigenvectors );

        if( !(covar_flags & CV_COVAR_NORMAL) )
        {
            // CV_PCA_DATA_AS_ROW: cols(A)>rows(A). x=A'*y -> x'=y'*A
            // CV_PCA_DATA_AS_COL: rows(A)>cols(A). x=A''*y -> x'=y'*A'
            Mat tmp_data, tmp_mean = repeat(mean, data.rows/mean.rows, data.cols/mean.cols);
            if( data.type() != ctype || tmp_mean.data == mean.data )
            {
                data.convertTo( tmp_data, ctype );
                subtract( tmp_data, tmp_mean, tmp_data );
            }
            else
            {
                subtract( data, tmp_mean, tmp_mean );
                tmp_data = tmp_mean;
            }

            Mat evects1(count, len, ctype);
            gemm( eigenvectors, tmp_data, 1, Mat(), 0, evects1,
                (flags & CV_PCA_DATA_AS_COL) ? CV_GEMM_B_T : 0);
            eigenvectors = evects1;

            // normalize eigenvectors
            for( i = 0; i < out_count; i++ )
            {
                Mat vec = eigenvectors.row(i);
                normalize(vec, vec);
            }
        }

        if( count > out_count )
        {
            // use clone() to physically copy the data and thus deallocate the original matrices
            eigenvalues = eigenvalues.rowRange(0,out_count).clone();
            eigenvectors = eigenvectors.rowRange(0,out_count).clone();
        }
        return *this;
    }

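Core code of the PCA class

The calcCovarMatrix call computes the covariance matrix of the sample matrix, and the eigen call that follows it computes the eigenvalues and eigenvectors of that covariance matrix.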

    void Eigenfaces::predict(InputArray _src, int &minClass, double &minDist) const {
        // get data
        Mat src = _src.getMat();
        // make sure the user is passing correct data
        if(_projections.empty()) {
            // throw error if no data (or simply return -1?)
            string error_message = "This Eigenfaces model is not computed yet. Did you call Eigenfaces::train?";
            CV_Error(CV_StsError, error_message);
        } else if(_eigenvectors.rows != static_cast<int>(src.total())) {
            // check data alignment just for clearer exception messages
            string error_message = format("Wrong input image size. Reason: Training and Test images must be of equal size! Expected an image with %d elements, but got %d.", _eigenvectors.rows, src.total());
            CV_Error(CV_StsBadArg, error_message);
        }
        // project into PCA subspace
        Mat q = subspaceProject(_eigenvectors, _mean, src.reshape(1,1));
        minDist = DBL_MAX;
        minClass = -1;
        for(size_t sampleIdx = 0; sampleIdx < _projections.size(); sampleIdx++) {
            double dist = norm(_projections[sampleIdx], q, NORM_L2);
            if((dist < minDist) && (dist < _threshold)) {
                minDist = dist;
                minClass = _labels.at<int>((int)sampleIdx);
            }
        }
    }

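Core code of face recognition

The subspaceProject call computes X, the projection onto the set of eigenfaces of the face vector to be recognized minus the mean face vector. The norm call inside the loop computes the Euclidean distance between X and the projection vector of each sample face (this distance serves as the measure of face similarity), and the if block keeps the smallest distance as the recognition result.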
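The projection and nearest-neighbour search described above can be sketched without OpenCV as follows. The names projectFace, predictNearest and Prediction are made up for this illustration; the sketch assumes one eigenface per column of W, mirroring the (face - mean) * W projection and the threshold test in the listing.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

// Project a face onto the eigenfaces: y = (face - mean) * W, one eigenface per column of W.
Vector projectFace(const Matrix& w, const Vector& mean, const Vector& face) {
    assert(!w.empty() && w.size() == face.size() && mean.size() == face.size());
    Vector y(w[0].size(), 0.0);
    for (std::size_t i = 0; i < face.size(); ++i)
        for (std::size_t k = 0; k < y.size(); ++k)
            y[k] += (face[i] - mean[i]) * w[i][k];
    return y;
}

struct Prediction {
    int label;        // -1 when no sample is closer than the threshold
    double distance;  // Euclidean distance to the best match
};

// Nearest neighbour in the projected space, subject to a distance threshold.
Prediction predictNearest(const std::vector<Vector>& projections,
                          const std::vector<int>& labels,
                          const Vector& q, double threshold) {
    assert(projections.size() == labels.size());
    Prediction best{-1, std::numeric_limits<double>::max()};
    for (std::size_t s = 0; s < projections.size(); ++s) {
        assert(projections[s].size() == q.size());
        double sum = 0.0;
        for (std::size_t k = 0; k < q.size(); ++k) {
            const double d = projections[s][k] - q[k];
            sum += d * d;
        }
        const double dist = std::sqrt(sum);
        if (dist < best.distance && dist < threshold) {
            best.label = labels[s];
            best.distance = dist;
        }
    }
    return best;
}
```

Because a candidate must be closer than the threshold to be accepted, a very small threshold leaves the label at -1, which matches the behaviour the sample program in section 3 demonstrates by setting the threshold to 0.0.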

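3 Sample Code

Finally, here is sample code for Eigenfaces face recognition. It again uses the AT&T face database; the download address is given in the previous post.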

    #include "opencv2/core/core.hpp"
    #include "opencv2/highgui/highgui.hpp"
    #include "opencv2/contrib/contrib.hpp"

    #define CV_VERSION_ID CVAUX_STR(CV_MAJOR_VERSION) CVAUX_STR(CV_MINOR_VERSION) CVAUX_STR(CV_SUBMINOR_VERSION)

    #ifdef _DEBUG
    #define cvLIB(name) "opencv_" name CV_VERSION_ID "d"
    #else
    #define cvLIB(name) "opencv_" name CV_VERSION_ID
    #endif

    #pragma comment( lib, cvLIB("core") )
    #pragma comment( lib, cvLIB("imgproc") )
    #pragma comment( lib, cvLIB("highgui") )
    #pragma comment( lib, cvLIB("flann") )
    #pragma comment( lib, cvLIB("features2d") )
    #pragma comment( lib, cvLIB("calib3d") )
    #pragma comment( lib, cvLIB("gpu") )
    #pragma comment( lib, cvLIB("legacy") )
    #pragma comment( lib, cvLIB("ml") )
    #pragma comment( lib, cvLIB("objdetect") )
    #pragma comment( lib, cvLIB("ts") )
    #pragma comment( lib, cvLIB("video") )
    #pragma comment( lib, cvLIB("contrib") )
    #pragma comment( lib, cvLIB("nonfree") )

    #include <cstdlib>   // atoi, exit
    #include <iostream>
    #include <fstream>
    #include <sstream>

    using namespace cv;
    using namespace std;

    static Mat toGrayscale(InputArray _src) {
        Mat src = _src.getMat();
        // only allow one channel
        if(src.channels() != 1) {
            CV_Error(CV_StsBadArg, "Only Matrices with one channel are supported");
        }
        // create and return normalized image
        Mat dst;
        cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC1);
        return dst;
    }

    static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
        std::ifstream file(filename.c_str(), ifstream::in);
        if (!file) {
            string error_message = "No valid input file was given, please check the given filename.";
            CV_Error(CV_StsBadArg, error_message);
        }
        string line, path, classlabel;
        while (getline(file, line)) {
            stringstream liness(line);
            getline(liness, path, separator);
            getline(liness, classlabel);
            if(!path.empty() && !classlabel.empty()) {
                // second argument 0: load the image as grayscale
                images.push_back(imread(path, 0));
                labels.push_back(atoi(classlabel.c_str()));
            }
        }
    }

    int main(int argc, const char *argv[]) {
        // Check for valid command line arguments, print usage
        // if no arguments were given.
        if (argc != 2) {
            cout << "usage: " << argv[0] << " <csv.ext>" << endl;
            exit(1);
        }
        // Get the path to your CSV.
        string fn_csv = string(argv[1]);
        // These vectors hold the images and corresponding labels.
        vector<Mat> images;
        vector<int> labels;
        // Read in the data. This can fail if no valid
        // input filename is given.
        try {
            read_csv(fn_csv, images, labels);
        } catch (cv::Exception& e) {
            cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
            // nothing more we can do
            exit(1);
        }
        // Quit if there are not enough images for this demo.
        if(images.size() <= 1) {
            string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
            CV_Error(CV_StsError, error_message);
        }
        // Get the height from the first image. We'll need this
        // later in code to reshape the images to their original
        // size:
        int height = images[0].rows;
        // The following lines simply get the last images from
        // your dataset and remove it from the vector. This is
        // done, so that the training data (which we learn the
        // cv::FaceRecognizer on) and the test data we test
        // the model with, do not overlap.
        Mat testSample = images[images.size() - 1];
        int testLabel = labels[labels.size() - 1];
        images.pop_back();
        labels.pop_back();
        // The following lines create an Eigenfaces model for
        // face recognition and train it with the images and
        // labels read from the given CSV file.
        // This here is a full PCA, if you just want to keep
        // 10 principal components (read Eigenfaces), then call
        // the factory method like this:
        //
        //      cv::createEigenFaceRecognizer(10);
        //
        // If you want to create a FaceRecognizer with a
        // confidence threshold, call it with:
        //
        //      cv::createEigenFaceRecognizer(10, 123.0);
        //
        Ptr<FaceRecognizer> model = createEigenFaceRecognizer();
        model->train(images, labels);
        // The following line predicts the label of a given
        // test image:
        int predictedLabel = model->predict(testSample);
        //
        // To get the confidence of a prediction call the model with:
        //
        //      int predictedLabel = -1;
        //      double confidence = 0.0;
        //      model->predict(testSample, predictedLabel, confidence);
        //
        string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
        cout << result_message << endl;
        // Sometimes you'll need to get/set internal model data,
        // which isn't exposed by the public cv::FaceRecognizer.
        // Since each cv::FaceRecognizer is derived from a
        // cv::Algorithm, you can query the data.
        //
        // First we'll use it to set the threshold of the FaceRecognizer
        // to 0.0 without retraining the model. This can be useful if
        // you are evaluating the model:
        //
        model->set("threshold", 0.0);
        // Now the threshold of this model is set to 0.0. A prediction
        // now returns -1, as it's impossible to have a distance below
        // it
        predictedLabel = model->predict(testSample);
        cout << "Predicted class = " << predictedLabel << endl;
        // Here is how to get the eigenvalues of this Eigenfaces model:
        Mat eigenvalues = model->getMat("eigenvalues");
        // And we can do the same to display the Eigenvectors (read Eigenfaces):
        Mat W = model->getMat("eigenvectors");
        // From this we will display the (at most) first 10 Eigenfaces:
        for (int i = 0; i < min(10, W.cols); i++) {
            string msg = format("Eigenvalue #%d = %.5f", i, eigenvalues.at<double>(i));
            cout << msg << endl;
            // get eigenvector #i
            Mat ev = W.col(i).clone();
            // Reshape to original size & normalize to [0...255] for imshow.
            Mat grayscale = toGrayscale(ev.reshape(1, height));
            // Show the image & apply a Jet colormap for better sensing.
            Mat cgrayscale;
            applyColorMap(grayscale, cgrayscale, COLORMAP_JET);
            imshow(format("%d", i), cgrayscale);
        }
        waitKey(0);

        return 0;
    }

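The program prints the recognition result and displays the first 10 eigenfaces as pseudo-color images.

[Figure: program output and the first 10 eigenfaces rendered in pseudo-color]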

This post draws on the following sources, with thanks to their authors:

http://www.cnblogs.com/guoming0000/archive/2012/09/27/2706019.html

http://blog.csdn.net/zouxy09/article/details/45276053

http://blog.csdn.net/zhouxuguang236/article/details/40212143

http://wenku.baidu.com/view/6023207e168884868762d644.html

《数值分析简明教程》 (A Concise Course in Numerical Analysis), by Wang Bingtuan (王兵团), Zhang Zuoquan (张作泉) and Zhao Pingfu (赵平福)