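Image Processing and Computer Vision: Fundamentals, Classics, and Recent Developments (3): Signal Processing and Pattern Recognition in Computer Vision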

Last Update: 2012-6-23

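Starting with this chapter, we enter the core of this article. There are three core chapters, covering signal processing and pattern recognition, image processing and analysis, and computer vision. Rather than an exposition, this is more a list of classic papers with my own brief comments. Unlike the previous version, this time all papers are grouped by category and many more references have been added. The grouping does not follow a traditional taxonomy; instead it is split into many small categories. For example, SIFT and Harris each get a category of their own, although both could be filed under feature extraction. The aim is to highlight these practical and popular methods. For ease of future maintenance, the categories are sorted alphabetically.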

The papers for this chapter can be downloaded from:

http://iask.sina.com.cn/u/2252291285/ish?folderid=868770

1. Boosting

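Boosting is one of the most successful pattern recognition methods of the last decade or so; personally, I think Boosting and SVM can be called the twin stars of pattern recognition. It truly lives up to the saying "three cobblers with their wits combined surpass Zhuge Liang". As long as every base classifier is right more than 50% of the time, the base classifiers can in principle be combined into a classifier of arbitrary accuracy on the training data. This means even the simplest linear classifiers can serve as base classifiers. The most successful application of Boosting in computer vision is without doubt the Haar-feature face detector proposed by Viola and Jones. It sounds hard to believe, but Haar+Adaboost has been hugely successful in face detection, has become the de facto industry standard, and is gradually being extended to the detection of other objects.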
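To make the "weak classifiers combined into a strong one" idea concrete, here is a minimal AdaBoost sketch using one-dimensional threshold stumps as the base classifiers. The function names (`train_stump`, `adaboost`, `predict`) and the toy data are made up for illustration; this is not the Viola-Jones detector, only the re-weighting loop that sits underneath it.

```python
import math

def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best threshold stump.

    A stump predicts `polarity` when x >= threshold and `-polarity` otherwise.
    """
    best = None
    for thr in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= thr else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, polarity)
    return best

def adaboost(xs, ys, rounds):
    """Train `rounds` stumps; labels in ys must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thr, pol = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this weak classifier
        ensemble.append((alpha, thr, pol))
        # Increase the weight of misclassified samples, decrease the others.
        preds = [pol if x >= thr else -pol for x in xs]
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * (pol if x >= thr else -pol)
                for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy data (invented): no single stump separates it, three boosted stumps do.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

Each individual stump here misclassifies at least three of the ten points, yet the weighted vote of three stumps fits the whole training set.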

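The paper Rainer Lienhart published at ICIP 2002 is the best extension of Haar+Adaboost: it extends the original two orientations of Haar features to four. Lienhart is himself an active contributor to OpenCV, and the Cascade Classification implemented in the OpenCV library today includes his method. This also shows that the large conferences (such as ICIP, ICPR, and ICASSP) have good papers too, as long as you take the trouble to dig them out.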

[1997] A Decision-Theoretic Generalization of on-Line Learning and an Application to Boosting

[1998] Boosting the margin A new explanation for the effectiveness of voting methods

[2002 ICIP TR] Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection

[2003] The Boosting Approach to Machine Learning An Overview

[2004 IJCV] Robust Real-time Face Detection

2. Clustering

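The main clustering methods are K-means clustering, spectral clustering, and fuzzy clustering. How to determine the number of clusters automatically is a problem that has never really been solved. That is understandable, though: different evaluation criteria lead to different numbers of clusters. There are still some papers worth consulting on this, and in practice you can design your own criterion based on these methods. General pattern recognition textbooks cover clustering in detail but say relatively little about cluster validity, so the papers below are worth a look.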
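As a small illustration of why the number of clusters depends on the criterion, the sketch below runs plain K-means on scalar data for several values of k and reports the within-cluster sum of squared errors (SSE). The function name `kmeans_1d`, the deterministic initialization, and the toy data are assumptions of this example; SSE is only one of many possible criteria and is not one of the validity indexes from the papers listed here.

```python
def kmeans_1d(data, k, iterations=20):
    """Plain K-means on scalar data; returns (centers, sse)."""
    pts = sorted(data)
    n = len(pts)
    # Deterministic initialization (an assumption of this sketch):
    # spread the initial centers evenly over the sorted data.
    if k == 1:
        centers = [sum(pts) / n]
    else:
        centers = [float(pts[i * (n - 1) // (k - 1)]) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for x in pts:
            nearest = min(range(k), key=lambda j: (x - centers[j]) ** 2)
            clusters[nearest].append(x)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    sse = sum(min((x - c) ** 2 for c in centers) for x in pts)
    return centers, sse

# Toy data (invented) with two obvious groups.
data = [1, 2, 3, 11, 12, 13]
for k in (1, 2, 3):
    centers, sse = kmeans_1d(data, k)
    print(k, centers, sse)
```

SSE always shrinks as k grows, so it cannot pick k by itself; what one usually looks for is the point where the improvement flattens out, which is the kind of judgment the cluster validity literature tries to formalize.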

[1989 PAMI] Unsupervised Optimal Fuzzy Clustering

[1991 PAMI] A validity measure for fuzzy clustering

[1995 PAMI] On cluster validity for the fuzzy c-means model

[1998] Some New Indexes of Cluster Validity

[1999 ACM] Data Clustering A Review

[1999 JIIS] On Clustering Validation Techniques

[2001] Estimating the number of clusters in a dataset via the Gap statistic

[2001 NIPS] On Spectral Clustering

[2002] A stability based method for discovering structure in clustered data

[2007] A tutorial on spectral clustering

3. Compressive Sensing

Compressive sensing theory, which has been all the rage recently.

[2006 TIT] Compressed Sensing

[2008 SPM] An Introduction to Compressive Sampling

[2011 TSP] Structured Compressed Sensing From Theory to Applications

4. Decision Trees

For anyone interested in decision trees, this paper is a must-read.

[1986] Induction of Decision Trees

5. Dynamic Programming

Dynamic programming is another fairly practical method. Here I have picked one PAMI paper and one book chapter.

[1990 PAMI] Using dynamic programming for solving variational problems in vision

[Book Chapter] Dynamic Programming

6. Expectation Maximization

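EM is a very common method in computer vision, especially for parameter estimation and fitting, for example with Gaussian mixture models. EM and GMM get a chapter of their own in Bishop's PRML, and it is very well written. Many EM tutorials can also be found online.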
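For readers who have not seen EM applied to a Gaussian mixture, the following is a minimal one-dimensional sketch. The function names (`gaussian_pdf`, `em_gmm_1d`), the starting values, and the toy data are all assumptions made for illustration; a real implementation would work in log space and check for convergence instead of running a fixed number of iterations.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, means, variances, weights, iterations=50):
    """Fit a 1-D Gaussian mixture with EM, starting from the given parameters."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            likes = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(likes)
            resp.append([like / total for like in likes])
        # M-step: re-estimate parameters from the soft assignments.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density finite
            weights[j] = nj / len(data)
    return means, variances, weights

# Toy data (invented): two groups centered at 0 and 10.
data = [-0.5, 0.0, 0.5, 9.5, 10.0, 10.5]
means, variances, weights = em_gmm_1d(
    data, means=[1.0, 8.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
print(means, variances, weights)
```

The E-step computes soft assignments and the M-step treats them as fractional counts; this alternation is the same for any mixture model, only the M-step formulas change.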

[1977] Maximum likelihood from incomplete data via the EM algorithm

[1996 SPM] The Expectation-Maximization Algorithm

7. Graphical Models

Graphical models from the great Michael Jordan at Berkeley; this can be read alongside Bishop's PRML.

[1999 ML] An Introduction to Variational Methods for Graphical Models

8. Hidden Markov Model

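HMMs play a huge role in speech recognition and also have some applications in signal processing and image processing. I first came across them in connection with wavelets and retrieval, where an HMM is used to describe the dependencies between wavelet coefficients and the model is then used for retrieval. Listed here are a classic 1989 tutorial, several papers applying HMMs to wavelets, segmentation, retrieval, and texture, and a fairly early Chinese e-book. I no longer know who wrote it, so I would like to thank the author here.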

[1989] A tutorial on hidden Markov models and selected applications in speech recognition

[1998 TSP] Wavelet-based statistical signal processing using hidden Markov models

[2001 TIP] Multiscale image segmentation using wavelet-domain hidden Markov models

[2002 TMM] Rotation invariant texture characterization and retrieval using steerable wavelet-domain hidden Markov models

[2003 TIP] Wavelet-based texture analysis and synthesis using hidden Markov models

Hmm Chinese book.pdf

9. Independent Component Analysis

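Like PCA, independent component analysis plays an important role in computer vision. Here are two survey-style papers; the last entry is the TR version of the second one, with much the same content but somewhat clearer.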

[1999] Independent Component Analysis A Tutorial

[2000 NN] Independent component analysis algorithms and applications

[2000] Independent Component Analysis Algorithms and Applications

10. Information Theory

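Information theory in computer vision. There is a very good book on this, Information Theory in Computer Vision and Pattern Recognition. It is available electronically, so it can serve as a reference if you need it.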

[1995 NC] An Information-Maximization Approach to Blind Separation and Blind Deconvolution

[2010] An information theory perspective on computational vision

11. Kalman Filter

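This topic is covered in some depth in Zhang Xianda's Modern Signal Processing, which also gives an interesting example. Listed here are Kalman's original paper, several surveys, and the Unscented Kalman Filter, together with a paper applying the Kalman filter to tracking and two e-books.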
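For orientation before reading the papers, here is a minimal scalar Kalman filter that estimates a nearly constant value from noisy measurements. The function name `kalman_1d`, the constant-state model, and the noise variances are assumptions of this sketch; it is not the example from Zhang Xianda's book.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (nearly) constant state.

    x is the state estimate and p is its error variance.
    Returns the list of estimates after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is modeled as constant, so only the
        # uncertainty grows (by the process noise variance).
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1 - gain) * p
        estimates.append(x)
    return estimates

# Invented measurements scattered around a true value of 5.0.
measurements = [5.2, 4.7, 5.1, 4.9, 5.3, 4.8, 5.0, 5.1, 4.9, 5.0]
estimates = kalman_1d(measurements, process_var=1e-4, measurement_var=0.1)
print(estimates)
```

The gain starts close to 1 (trust the measurement) and shrinks as the error variance `p` drops, so later measurements move the estimate less and less.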

[1960 Kalman] A New Approach to Linear Filtering and Prediction Problems Kalman

[1970] Least-squares estimation: from Gauss to Kalman

[1997 SPIE] A New Extension of the Kalman Filter to Nonlinear Systems

[2000] The Unscented Kalman Filter for Nonlinear Estimation

[2001 Siggraph] An Introduction to the Kalman Filter

[2003] A Study of the Kalman Filter applied to Visual Tracking

12. Pattern Recognition and Machine Learning

A few of the better-known survey papers in pattern recognition.

[2000 PAMI] Statistical pattern recognition a review

[2004 CSVT] An Introduction to Biometric Recognition

[2010 SPM] Machine Learning in Medical Imaging

13. Principal Component Analysis

The famous PCA, which is very useful for feature representation and for reducing feature dimensionality.
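As a reminder of what PCA actually computes, the sketch below finds the first principal component of a small point set: it centers the data, builds the covariance matrix, and uses power iteration to get the dominant eigenvector. The function name `first_principal_component` and the toy data are assumptions of this example, and power iteration is used only to stay within the standard library; in practice one would call an eigen or SVD routine.

```python
import math

def first_principal_component(points, iterations=100):
    """Return (unit direction, variance along it) of the first principal component."""
    n, d = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(d)]
    centered = [[p[i] - mean[i] for i in range(d)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance captured by this direction (the Rayleigh quotient v^T cov v).
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return v, variance

# Toy data (invented): points lying exactly on the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, variance = first_principal_component(points)
print(direction, variance)
```

Projecting each centered point onto `direction` gives a one-dimensional representation; for these points nothing is lost, because all of the variance lies along that single direction.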

[2001 PAMI] PCA versus LDA

[2001] Nonlinear component analysis as a kernel eigenvalue problem

[2002] A Tutorial on Principal Component Analysis

[2004 PAMI] Two-dimensional PCA a new approach to appearance-based face representation and recognition

[2009] A Tutorial on Principal Component Analysis

[2011] Robust Principal Component Analysis

[Book Chapter] Singular Value Decomposition and Principal Component Analysis

14. Random Forest

Random forests.

[2001 ML] Random Forests

15. RANSAC

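Random sample consensus, which takes a completely different route from traditional approaches such as least-squares fitting. It is also mentioned in Sonka's book.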
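To show how this differs from a least-squares fit, here is a minimal RANSAC sketch for fitting a line y = a*x + b in the presence of gross outliers. The function name `ransac_line`, the threshold, the iteration count, and the toy data are assumptions of this example; it measures vertical residuals and skips the usual final least-squares refit on the inliers.

```python
import random

def ransac_line(points, threshold=0.5, iterations=200, seed=0):
    """Fit y = slope * x + intercept by RANSAC; returns (model, inliers)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # 1. Draw a minimal sample: two points define a line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a*x + b
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        # 2. Count the points that agree with this candidate line.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        # 3. Keep the candidate with the largest consensus set.
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (slope, intercept), inliers
    return best_model, best_inliers

# Toy data (invented): ten points on y = 2x + 1 plus three gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(2.0, 30.0), (5.0, -20.0), (8.0, 50.0)]
model, inliers = ransac_line(points)
print(model, len(inliers))
```

A least-squares fit over all thirteen points would be pulled toward the outliers, whereas RANSAC ignores them as long as some random sample consists only of inliers.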

[2009 BMVC] Performance Evaluation of RANSAC Family

16. Singular Value Decomposition

For non-square matrices, this is where SVD comes into its own. Most pattern recognition textbooks cover SVD. Listed here are K-SVD and one book chapter.

[2006 TSP] K-SVD An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation

[Book Chapter] Singular Value Decomposition and Principal Component Analysis

17. Sparse Representation

These are mainly papers from the Proceedings of the IEEE.

[2009 PAMI] Robust Face Recognition via Sparse Representation

[2009 PIEEE] Image Decomposition and Separation Using Sparse Representations An Overview

[2010 PIEEE] Dictionaries for Sparse Representation Modeling

[2010 PIEEE] It's All About the Data

[2010 PIEEE] Matrix Completion With Noise

[2010 PIEEE] On the Role of Sparse and Redundant Representations in Image Processing

[2010 PIEEE] Sparse Representation for Computer Vision and Pattern Recognition

[2011 SPM] Dictionary Learning

18. Support Vector Machines

[1998] A Tutorial on Support Vector Machines for Pattern Recognition

[2004] LIBSVM A Library for Support Vector Machines

19. Wavelet

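Before the wavelet transform, the only tool for time-frequency analysis was the Fourier transform. As is well known, the Fourier transform has no resolution in the time domain and cannot capture local frequency information. The short-time Fourier transform overcomes this shortcoming, but it can only characterize frequency content within a fixed-size window, and it does not extend well to two dimensions. The wavelet transform solved the time-frequency analysis problem nicely and, as a multiresolution analysis tool, has been developed and applied extensively in image processing. Several people have to be mentioned in the history of wavelets: Mallat, Daubechies, Vetterli, M. N. Do, Sweldens, and Donoho. Mallat and Daubechies laid the framework for first-generation wavelets, and their books are required reading. Relatively speaking, Ten Lectures on Wavelets is too mathematical and hard to follow, while Mallat's A Wavelet Tour of Signal Processing leans more toward applications. Sweldens proposed second-generation wavelets, which allow the wavelet transform to be implemented quickly and conveniently; his contribution is somewhat like that of the FFT. Donoho, Vetterli, Mallat, and their students proposed geometric wavelet transforms such as the Ridgelet, Curvelet, Bandelet, and Contourlet, which give the wavelet transform directionality and make it better suited to tasks such as compression and denoising. M. N. Do deserves special mention: he is Vietnamese, won a silver medal at the IMO, and has published prolifically in this area. China produces about five IMO gold medalists every year; I hope one or two of them will enter this field and give the rest of us someone to look up to, rather than all rushing into finance and management, fields that have little to do with mathematics. I would very much like to see a Chinese Terence Tao, a Chinese M. N. Do.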

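Speaking of wavelets, JPEG2000 has to be mentioned. JPEG2000 uses the 9/7 and 5/3 wavelets, implemented with the lifting scheme of Sweldens and Daubechies. Compare JPEG with JPEG2000 and you will find that JPEG2000 is a big improvement in performance. I used to think the adoption of JPEG2000 was only a matter of time, but that idea now looks naive. More than ten years have passed and JPEG2000 still shows no sign of breaking through. It has to be said that the inertia of industry is very strong: if the existing technology has no fatal flaw, it is extremely hard to displace. Unfortunately for JPEG2000, the needs its various advantages address have largely been met by recent hardware. Compression ratio? With 1 TB and 2 TB hard drives now common, nobody cares much about compression ratio. Progressive transmission? Network speeds, including wireless, are already quite fast, so progressive transmission is no longer much of an advantage either. It feels as though image compression has less and less of a future, a trend that is also visible in recent conference and journal papers. In any case, the JPEG2000 overview is still worth reading.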
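To give a feel for what a lifting implementation looks like, here is a minimal sketch of one level of the reversible integer 5/3 transform, written the way it is commonly presented for JPEG2000 lossless coding: a predict step produces the detail coefficients and an update step produces the smooth coefficients. The function names, the even-length restriction, and the boundary handling (repeating the nearest available neighbor) are assumptions of this sketch and are not taken from the standard's text.

```python
def lift_53_forward(x):
    """One level of the integer 5/3 lifting transform on an even-length list.

    Returns (s, d): the low-pass (smooth) and high-pass (detail) coefficients.
    """
    assert len(x) % 2 == 0, "this sketch only handles even-length signals"
    half = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Predict: each odd sample is predicted from its two even neighbors.
    d = [odd[n] - (even[n] + even[min(n + 1, half - 1)]) // 2
         for n in range(half)]
    # Update: each even sample is corrected using the neighboring details.
    s = [even[n] + (d[max(n - 1, 0)] + d[n] + 2) // 4
         for n in range(half)]
    return s, d

def lift_53_inverse(s, d):
    """Undo lift_53_forward exactly by reversing the steps and their signs."""
    half = len(s)
    even = [s[n] - (d[max(n - 1, 0)] + d[n] + 2) // 4 for n in range(half)]
    odd = [d[n] + (even[n] + even[min(n + 1, half - 1)]) // 2
           for n in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x

# Invented test signal: a linear ramp, which the predict step models exactly
# everywhere except at the right boundary.
signal = [10, 12, 14, 16, 18, 20, 22, 24]
s, d = lift_53_forward(signal)
print(s, d)
print(lift_53_inverse(s, d) == signal)
```

Because each lifting step only adds a function of the other half of the samples, the inverse just subtracts the same quantity in reverse order. That is why the transform is exactly invertible in integer arithmetic, which is what lossless coding needs.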

[1989 PAMI] A theory for multiresolution signal decomposition: the wavelet representation

[1996 PAMI] Image Representation using 2D Gabor Wavelet

[1998] Factoring Wavelet Transforms into Lifting Steps

[1998] The Lifting Scheme: A Construction of Second Generation Wavelets

[2000 TCE] The JPEG2000 still image coding system: an overview

[2002 TIP] The curvelet transform for image denoising

[2003 TIP] Gray and color image contrast enhancement by the curvelet transform

[2003 TIP] Mathematical Properties of the jpeg2000 wavelet filters

[2003 TIP] The finite ridgelet transform for image representation

[2005 TIP] Sparse Geometric Image Representations With Bandelets

[2005 TIP] The Contourlet Transform: An Efficient Directional Multiresolution Image Representation

[2010 SPM] The Curvelet Transform
