[Summary] Neural Networks: Compression (CNN, RNN)

Time: 2021-11-22 13:55:09

distillation:

papers:

  1. NIPS2014_Distilling the Knowledge in a Neural Network_Hinton
    Use a large network to teach a small one. Take a 100-class classification task as an example: before, a label only said which class a sample belongs to; now the label is the large network's output, so for each sample the label is a 100-dimensional vector whose entries are the probabilities of the sample belonging to each class (soft targets). For instance, a trained 3-class model (BMW, garbage truck, carrot) classifying a picture of a BMW might output 0.80, 0.19, 0.01. Note the last two classes: although both are wrong, 0.19 is far larger than 0.01, so the soft targets tell the student that a BMW looks much more like a garbage truck than like a carrot. In short, an already-trained model carries richer information and finer discriminative structure than hard labels. (A sketch of the loss appears after this list.)
  2. FitNets: Hints for Thin Deep Nets
    Uses the soft targets from 1, plus intermediate-layer "hints" from the teacher, to train a student network that is deeper and thinner. (The hint loss is also sketched after this list.)
  3. Dropout Distillation
  4. Distilling Knowledge to Specialist Networks for Clustered Classification
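
A minimal PyTorch-style sketch of the two losses described above: Hinton's temperature-softened soft-target loss and a FitNets-style hint loss on an intermediate layer. The temperature T, the mixing weight alpha, and the regressor module are illustrative assumptions, not values from the papers.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style KD: KL divergence between temperature-softened
    distributions, mixed with the usual cross-entropy on hard labels."""
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_student = F.log_softmax(student_logits / T, dim=1)
    # The T^2 factor keeps the soft-target gradients on the same scale
    # as the hard-label gradients when T changes.
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

def hint_loss(student_feat, teacher_feat, regressor):
    """FitNets-style hint: a small learned regressor maps the student's
    intermediate feature to the teacher's 'hint' layer, matched with MSE."""
    return F.mse_loss(regressor(student_feat), teacher_feat)
```

In FitNets the hint loss is used in a first stage to pre-train the student up to its guided layer; the whole student is then trained with the distillation loss.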

websites

Paper notes: "Distilling the Knowledge in a Neural Network"
Paper notes: "FitNets: Hints for Thin Deep Nets"
http://sei.pku.edu.cn/~luyy11/slides/slides_141231_ft_distill-nips14.pdf

compression

websites

Compressing and regularizing deep neural networks

papers

ICLR2016_Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding (ICLR 2016 Best Paper)
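
A rough sketch of the first two stages of the Deep Compression pipeline, magnitude pruning and k-means weight sharing ("trained quantization"). The sparsity level, the number of clusters, and the use of scikit-learn's KMeans are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def magnitude_prune(weights, sparsity=0.9):
    """Pruning stage: zero out the smallest-magnitude weights.
    The sparsity level is an assumed example value."""
    threshold = np.quantile(np.abs(weights).ravel(), sparsity)
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def quantize_shared(weights, mask, n_clusters=16):
    """Weight-sharing stage: cluster the surviving weights with k-means;
    each weight is replaced by its cluster centroid, so only a small
    codebook plus per-weight cluster indices need to be stored."""
    vals = weights[mask].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(vals)
    codebook = km.cluster_centers_.ravel()
    quantized = np.zeros_like(weights)
    quantized[mask] = codebook[km.labels_]
    return quantized, codebook
```

The actual method additionally retrains after pruning, fine-tunes the shared centroids by accumulating gradients per cluster, and finally Huffman-codes the indices; those steps are omitted here.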


LSTM

  1. NIPS2016_Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences
    (Roughly: it adds a new time gate that only lets each unit update its state during a short periodic "open" phase, which sparsifies updates and speeds up training? See the sketch after this list.)
  2. RECURRENT NEURAL NETWORK TRAINING WITH DARK KNOWLEDGE TRANSFER
    Applies the distillation idea: a CNN teacher is used to train an LSTM student.
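
A sketch of the Phased LSTM time gate from item 1, which is what keeps most units from updating at every step. The piecewise shape (a rising/falling "open" phase plus a small leak alpha while closed) follows the paper's description, but the parameter values and the standalone-function packaging are my assumptions.

```python
import torch

def time_gate(t, tau, shift, r_on=0.05, alpha=1e-3):
    """Phased-LSTM-style time gate k for timestamp t (scalar tensor),
    given per-unit periods tau and phase shifts. r_on is the fraction of
    each period the gate is open; alpha is the leak while closed."""
    phi = torch.remainder(t - shift, tau) / tau      # phase in [0, 1)
    rising = 2.0 * phi / r_on                        # first half of the open phase
    falling = 2.0 - 2.0 * phi / r_on                 # second half of the open phase
    closed = alpha * phi                             # small leak while closed
    return torch.where(phi < 0.5 * r_on, rising,
           torch.where(phi < r_on, falling, closed))

# Example: 128 hidden units with log-uniform periods and random phase shifts.
tau = torch.exp(torch.empty(128).uniform_(1.0, 6.0))
shift = torch.rand(128) * tau
k = time_gate(torch.tensor(42.0), tau, shift)        # gate values at timestamp t = 42

# The gate then mixes the proposed and previous states,
#   c_t = k * c_tilde + (1 - k) * c_prev
#   h_t = k * h_tilde + (1 - k) * h_prev
# so each unit only really updates during its short open phase.
```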