Paper background:
IEEE International Conference on Computer Vision 2015
Ziwei Liu1, Ping Luo1, Xiaogang Wang2, Xiaoou Tang1
1Department of Information Engineering, The Chinese University of *
2Department of Electronic Engineering, The Chinese University of *
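Contributions:
1. Improves face localization accuracy independently of the background; see the figure below comparing against state-of-the-art methods
2. Recognizes fine-grained facial attributes
3. Good news for developers: it provides CelebA (based on CelebFaces [1]), a portrait database of about 200,000 images labeled with 40 common attributes, and LFWA (based on LFW [2])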
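Model architecture:
1. LNeto localizes the head-and-shoulders region
2. LNets further localizes the face within that region
3. ANet ends in a fully connected layer used for attribute prediction
4. SVMs perform the attribute classification on the features from multiple fully connected layers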
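The coarse-to-fine cropping in steps 1 and 2 can be sketched as two nested crops. This is only an illustration of the coordinate bookkeeping, not the paper's implementation: the `Box` layout, the helper names `crop` and `to_image_coords`, and the box values are all assumptions made up for the example, and the networks themselves are replaced by hard-coded boxes.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (top, left, height, width).
Box = Tuple[int, int, int, int]


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the rectangle described by `box` out of a 2-D image."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def to_image_coords(outer: Box, inner: Box) -> Box:
    """Map a box found inside the `outer` crop back to full-image coordinates."""
    return (outer[0] + inner[0], outer[1] + inner[1], inner[2], inner[3])


# Toy 8x8 "image" whose pixel value encodes its position (10 * row + col).
image = [[10 * r + c for c in range(8)] for r in range(8)]

# Stand-ins for network outputs (made-up values, not from the paper):
head_shoulder = (1, 2, 6, 5)  # what LNeto might return, in image coordinates
face_in_crop = (1, 1, 3, 3)   # what LNets might return, relative to that crop

face = to_image_coords(head_shoulder, face_in_crop)
two_step = crop(crop(image, head_shoulder), face_in_crop)
direct = crop(image, face)
```

Cropping twice and cropping once with the mapped box give the same patch, which is the patch that would then be passed to ANet.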
Detailed network structure, which mixes locally shared and globally shared parameters:
More specifically, the network structures of LNeto and
LNets are the same as shown in Fig.3 (a) and (b), which
stack two max-pooling and five convolutional layers (C1 to
C5) with globally shared filters. These filters are recurrently
applied at every location of the image and are able to
account for large face translation and scaling. ANet stacks
four convolutional layers (C1 to C4), three max-pooling
layers, and one fully-connected layer (FC), where the filters
at C1 and C2 are globally shared, while the filters at C3
and C4 are locally shared. As shown in Fig.3 (c), the
response maps at C2 and C3 are divided into grids with
non-overlapping cells, each of which learns different filters.
The locally shared filters have been proved effective for
face related problems [24, 23], because they can capture
different information from different face parts. The network
structures are specified in Fig.3. For instance, the filters
at C1 of LNeto has 96 channels and the filter size in each
channel is 11×11×3, as the input image xo contains three
color channels.
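To make the difference between globally and locally shared filters concrete, here is a minimal single-channel sketch in plain Python. It is an assumption-laden illustration, not the paper's code: the function names, the 2×2 grid, and the toy kernels are invented for the example, and each cell is convolved in isolation ("valid" mode), which ignores how a real layer would treat cell borders. Only the 96 filters of size 11×11×3 come from the quoted text.

```python
def conv2d_valid(x, k):
    """'Valid' 2-D cross-correlation of map x with one kernel k (globally shared)."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def locally_shared_conv(x, kernels):
    """kernels[r][c] is the filter used only inside grid cell (r, c).

    The map is split into non-overlapping cells; each cell has its own filter.
    """
    rows, cols = len(kernels), len(kernels[0])
    cell_h, cell_w = len(x) // rows, len(x[0]) // cols
    out = []
    for r in range(rows):
        cell_outputs = []
        for c in range(cols):
            cell = [row[c * cell_w:(c + 1) * cell_w]
                    for row in x[r * cell_h:(r + 1) * cell_h]]
            cell_outputs.append(conv2d_valid(cell, kernels[r][c]))
        # Stitch the outputs of this grid row side by side.
        for i in range(len(cell_outputs[0])):
            out.append([v for cell_out in cell_outputs for v in cell_out[i]])
    return out


def weight_count(num_filters, size, in_channels, cells=1):
    """Filter weights in a layer; locally shared layers keep one set per cell."""
    return num_filters * size * size * in_channels * cells


def constant_kernel(value):
    return [[value, value], [value, value]]


x = [[1] * 4 for _ in range(4)]                      # toy 4x4 response map
shared = conv2d_valid(x, constant_kernel(1))         # one filter everywhere
local = locally_shared_conv(
    x, [[constant_kernel(1), constant_kernel(2)],
        [constant_kernel(3), constant_kernel(4)]])   # one filter per cell

c1_weights = weight_count(96, 11, 3)                 # C1 of LNeto, from the quote
c1_if_local = weight_count(96, 11, 3, cells=4)       # hypothetical 2x2 grid
```

On the all-ones input, the shared filter responds identically everywhere, while the locally shared version responds differently in each cell. The cost is that the weight count grows with the number of cells.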
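Cropping the head region can run into a multiple-target detection problem; the paper handles it by computing spatial distances over the response density at each location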
[1] Y. Sun, X. Wang, and X. Tang. Deep learning face representation by joint identification-verification. In NIPS, 2014.
[2] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst, October 2007.
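A passing thought: combined with a generative model such as a GAN, this might enable something fun: generating characters with specified attributes from a semantic description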