3. Spark MLlib Deep Learning Convolution Neural Network (Deep Learning - Convolutional Neural Network) 3.2
Chapter 3: Convolution Neural Network (Convolutional Neural Network)
2 Basics and Source Code Analysis
2.1 Convolution Neural Network Basics
1) Basics:
Search Google or Baidu on your own; there is plenty of introductory material and skimming a few articles is enough, although many of them do not explain the details clearly.
For articles that do explain the details clearly and thoroughly, refer to the two posts below; the prerequisite is that you already have a basic understanding.
2) Key references:
http://www.cnblogs.com/fengfenggirl/p/cnn_implement.html
http://www.cnblogs.com/tornadomeet/archive/2013/05/05/3061457.html
2.2 Deep Learning CNN Source Code Analysis
2.2.1 CNN Code Structure
The CNN source code consists mainly of two classes, CNN and CNNModel, structured as follows:
CNN structure:
CNNModel structure:
2.2.2 CNN Training Process
2.2.3 CNN Analysis
(1) CNNLayers
/**
 * types: layer type
 * outputmaps: number of feature maps
 * kernelsize: convolution kernel size
 * k: convolution kernels
 * b: biases
 * dk: partial derivatives of the kernels
 * db: partial derivatives of the biases
 * scale: pooling size
 */
case class CNNLayers(
  types: String,
  outputmaps: Double,
  kernelsize: Double,
  scale: Double,
  k: Array[Array[BDM[Double]]],
  b: Array[Double],
  dk: Array[Array[BDM[Double]]],
  db: Array[Double]) extends Serializable
CNNLayers: a custom data type that stores the parameter information of each network layer.
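For orientation, a hedged sketch of what a single convolution-layer record could look like once filled in; the concrete values are illustrative assumptions, and the real records are built by CnnSetup in the next section:

import breeze.linalg.{DenseMatrix => BDM}

// Illustrative only: a "c" layer with 1 input map, 6 output maps and 5x5 kernels.
// k and dk hold one matrix per (input map, output map) pair; scale is carried
// along but only used by "s" (subsampling) layers.
val convLayer = CNNLayers(
  types = "c",
  outputmaps = 6.0,
  kernelsize = 5.0,
  scale = 2.0,
  k = Array(Array.fill(6)(BDM.zeros[Double](5, 5))),
  b = Array.fill(6)(0.0),
  dk = Array(Array.fill(6)(BDM.zeros[Double](5, 5))),
  db = Array.fill(6)(0.0))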
(2) CnnSetup
Initializes the CNN parameters, building the network layer by layer from the configuration.
/** Initialize the parameters of each CNN layer. */
def CnnSetup: (Array[CNNLayers], BDM[Double], BDM[Double], Double) = {
  var inputmaps1 = 1.0
  var mapsize1 = mapsize
  var confinit = ArrayBuffer[CNNLayers]()
  for (l <- 0 to layer - 1) { // layer
    val type1 = types(l)
    val outputmap1 = outputmaps(l)
    val kernelsize1 = kernelsize(l)
    val scale1 = scale(l)
    val layersconf = if (type1 == "s") { // initialize each layer's parameters
      mapsize1 = mapsize1 / scale1
      val b1 = Array.fill(inputmaps1.toInt)(0.0)
      val ki = Array(Array(BDM.zeros[Double](1, 1)))
      new CNNLayers(type1, outputmap1, kernelsize1, scale1, ki, b1, ki, b1)
    } else if (type1 == "c") {
      mapsize1 = mapsize1 - kernelsize1 + 1.0
      val fan_out = outputmap1 * math.pow(kernelsize1, 2)
      val fan_in = inputmaps1 * math.pow(kernelsize1, 2)
      val ki = ArrayBuffer[Array[BDM[Double]]]()
      for (i <- 0 to inputmaps1.toInt - 1) { // input map
        val kj = ArrayBuffer[BDM[Double]]()
        for (j <- 0 to outputmap1.toInt - 1) { // output map
          val kk = (BDM.rand[Double](kernelsize1.toInt, kernelsize1.toInt) - 0.5) * 2.0 * sqrt(6.0 / (fan_in + fan_out))
          kj += kk
        }
        ki += kj.toArray
      }
      val b1 = Array.fill(outputmap1.toInt)(0.0)
      inputmaps1 = outputmap1
      new CNNLayers(type1, outputmap1, kernelsize1, scale1, ki.toArray, b1, ki.toArray, b1)
    } else {
      val ki = Array(Array(BDM.zeros[Double](1, 1)))
      val b1 = Array(0.0)
      new CNNLayers(type1, outputmap1, kernelsize1, scale1, ki, b1, ki, b1)
    }
    confinit += layersconf
  }
  val fvnum = mapsize1(0, 0) * mapsize1(0, 1) * inputmaps1
  val ffb = BDM.zeros[Double](onum, 1)
  val ffW = (BDM.rand[Double](onum, fvnum.toInt) - 0.5) * 2.0 * sqrt(6.0 / (onum + fvnum))
  (confinit.toArray, ffb, ffW, alpha)
}
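As a quick numeric check of the initialization range used above (a Xavier/Glorot-style uniform bound; the 1-input-map, 6-output-map, 5x5-kernel figures are assumed purely for illustration):

import scala.math.sqrt

// fan_in  = inputmaps  * kernelsize^2 = 1 * 25 = 25
// fan_out = outputmaps * kernelsize^2 = 6 * 25 = 150
// kernel weights are drawn uniformly from (-bound, +bound)
val fanIn = 1.0 * 5 * 5
val fanOut = 6.0 * 5 * 5
val bound = sqrt(6.0 / (fanIn + fanOut))
println(bound) // ~0.185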
(3) expand
Kronecker-product expansion: each element of the input matrix is replicated into an s(0) x s(1) block.
/**
 * Kronecker-product expansion
 *
 */
def expand(a: BDM[Double], s: Array[Int]): BDM[Double] = {
  // val a = BDM((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
  // val s = Array(3, 2)
  val sa = Array(a.rows, a.cols)
  var tt = new Array[Array[Int]](sa.length)
  for (ii <- sa.length - 1 to 0 by -1) {
    var h = BDV.zeros[Int](sa(ii) * s(ii))
    h(0 to sa(ii) * s(ii) - 1 by s(ii)) := 1
    tt(ii) = Accumulate(h).data
  }
  var b = BDM.zeros[Double](tt(0).length, tt(1).length)
  for (j1 <- 0 to b.rows - 1) {
    for (j2 <- 0 to b.cols - 1) {
      b(j1, j2) = a(tt(0)(j1) - 1, tt(1)(j2) - 1)
    }
  }
  b
}
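To see what expand computes without the local helpers, note that replicating each element into an s(0) x s(1) block is the Kronecker product with a matrix of ones. A minimal sketch using Breeze's built-in kron, expected to match expand(a, Array(3, 2)):

import breeze.linalg.{DenseMatrix => BDM, kron}

val a = BDM((1.0, 2.0), (3.0, 4.0))
val expanded = kron(a, BDM.ones[Double](3, 2))
// 1.0  1.0  2.0  2.0
// 1.0  1.0  2.0  2.0
// 1.0  1.0  2.0  2.0
// 3.0  3.0  4.0  4.0
// 3.0  3.0  4.0  4.0
// 3.0  3.0  4.0  4.0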
(4) convn
2-D convolution with either "valid" or "full" output shape.
/**
 * convn: convolution computation
 */
def convn(m0: BDM[Double], k0: BDM[Double], shape: String): BDM[Double] = {
  // val m0 = BDM((1.0, 1.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0), (0.0, 1.0, 1.0, 0.0), (0.0, 1.0, 1.0, 0.0))
  // val k0 = BDM((1.0, 1.0), (0.0, 1.0))
  // val m0 = BDM((1.0, 1.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0))
  // val k0 = BDM((1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0))
  val out1 = shape match {
    case "valid" =>
      val m1 = m0
      val k1 = k0.t
      val row1 = m1.rows - k1.rows + 1
      val col1 = m1.cols - k1.cols + 1
      var m2 = BDM.zeros[Double](row1, col1)
      for (i <- 0 to row1 - 1) {
        for (j <- 0 to col1 - 1) {
          val r1 = i
          val r2 = r1 + k1.rows - 1
          val c1 = j
          val c2 = c1 + k1.cols - 1
          val mi = m1(r1 to r2, c1 to c2)
          m2(i, j) = (mi :* k1).sum
        }
      }
      m2
    case "full" =>
      var m1 = BDM.zeros[Double](m0.rows + 2 * (k0.rows - 1), m0.cols + 2 * (k0.cols - 1))
      for (i <- 0 to m0.rows - 1) {
        for (j <- 0 to m0.cols - 1) {
          m1((k0.rows - 1) + i, (k0.cols - 1) + j) = m0(i, j)
        }
      }
      val k1 = Rot90(Rot90(k0))
      val row1 = m1.rows - k1.rows + 1
      val col1 = m1.cols - k1.cols + 1
      var m2 = BDM.zeros[Double](row1, col1)
      for (i <- 0 to row1 - 1) {
        for (j <- 0 to col1 - 1) {
          val r1 = i
          val r2 = r1 + k1.rows - 1
          val c1 = j
          val c2 = c1 + k1.cols - 1
          val mi = m1(r1 to r2, c1 to c2)
          m2(i, j) = (mi :* k1).sum
        }
      }
      m2
  }
  out1
}
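A quick way to sanity-check the output shapes: for an r x c input and a kr x kc kernel, "valid" yields (r-kr+1) x (c-kc+1) and "full" yields (r+kr-1) x (c+kc-1). Below is a minimal standalone sketch of the sliding-window sum of element-wise products used in both branches (the kernel transposition/rotation of convn is omitted here):

import breeze.linalg.{DenseMatrix => BDM}

// Sliding-window sum of element-wise products over all "valid" positions.
def slidingSum(m: BDM[Double], k: BDM[Double]): BDM[Double] = {
  val out = BDM.zeros[Double](m.rows - k.rows + 1, m.cols - k.cols + 1)
  for (i <- 0 until out.rows; j <- 0 until out.cols) {
    val window = m(i to i + k.rows - 1, j to j + k.cols - 1)
    out(i, j) = (window :* k).sum
  }
  out
}

// A 4x4 input with a 2x2 kernel gives a 3x3 "valid" result (5x5 for "full").
val m = BDM.ones[Double](4, 4)
val k = BDM.ones[Double](2, 2)
println(slidingSum(m, k)) // every entry is 4.0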
(5) CNNtrain
Trains the neural network.
Input parameters: train_d, the training RDD; opts, the training options.
Output: CNNModel, the trained model.
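A hedged usage sketch of how opts is interpreted (opts(0) = batch size, opts(1) = number of epochs, opts(2) = validation fraction, as read in the code below); the names cnn and train_d, and how the CNN instance is configured, are assumptions not shown in this excerpt:

// batch size = 100, 10 epochs, 20% of the samples held out for cross-validation
val opts = Array(100.0, 10.0, 0.2)
// val train_d: RDD[(BDM[Double], BDM[Double])] = ...   // (label, features) matrix pairs, prepared elsewhere
// val model = cnn.CNNtrain(train_d, opts)              // cnn: a configured CNN instance (assumed)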
/**
 * Run the convolutional neural network training algorithm.
 */
def CNNtrain(train_d: RDD[(BDM[Double], BDM[Double])], opts: Array[Double]): CNNModel = {
  val sc = train_d.sparkContext
  var initStartTime = System.currentTimeMillis()
  var initEndTime = System.currentTimeMillis()
  // initialize the network parameters
  var (cnn_layers, cnn_ffb, cnn_ffW, cnn_alpha) = CnnSetup
  // split the samples into training data and cross-validation data
  val validation = opts(2)
  val splitW1 = Array(1.0 - validation, validation)
  val train_split1 = train_d.randomSplit(splitW1, System.nanoTime())
  val train_t = train_split1(0)
  val train_v = train_split1(1)
  // m: number of training samples
  val m = train_t.count
  // number of batches
  val batchsize = opts(0).toInt
  val numepochs = opts(1).toInt
  val numbatches = (m / batchsize).toInt
  var rL = Array.fill(numepochs * numbatches.toInt)(0.0)
  var n = 0
  // numepochs is the number of training epochs
  for (i <- 1 to numepochs) {
    initStartTime = System.currentTimeMillis()
    val splitW2 = Array.fill(numbatches)(1.0 / numbatches)
    // randomly split the samples into batches according to the split weights
    for (l <- 1 to numbatches) {
      // broadcast the weights
      val bc_cnn_layers = sc.broadcast(cnn_layers)
      val bc_cnn_ffb = sc.broadcast(cnn_ffb)
      val bc_cnn_ffW = sc.broadcast(cnn_ffW)
      // select this batch
      val train_split2 = train_t.randomSplit(splitW2, System.nanoTime())
      val batch_xy1 = train_split2(l - 1)
      // CNNff: forward propagation
      // net = cnnff(net, batch_x);
      val train_cnnff = CNN.CNNff(batch_xy1, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)
      // CNNbp: back propagation
      // net = cnnbp(net, batch_y);
      val train_cnnbp = CNN.CNNbp(train_cnnff, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)
      // weight update
      // net = cnnapplygrads(net, opts);
      val train_nnapplygrads = CNN.CNNapplygrads(train_cnnbp, bc_cnn_ffb, bc_cnn_ffW, cnn_alpha)
      cnn_ffW = train_nnapplygrads._1
      cnn_ffb = train_nnapplygrads._2
      cnn_layers = train_nnapplygrads._3
      // error and loss
      // compute the output error
      // net.L = 1/2 * sum(net.e(:) .^ 2) / size(net.e, 2);
      val rdd_loss1 = train_cnnbp._1.map(f => f._5)
      val (loss2, counte) = rdd_loss1.treeAggregate((0.0, 0L))(
        seqOp = (c, v) => {
          // c: (e, count), v: (m)
          val e1 = c._1
          val e2 = (v :* v).sum
          val esum = e1 + e2
          (esum, c._2 + 1)
        },
        combOp = (c1, c2) => {
          // c: (e, count)
          val e1 = c1._1
          val e2 = c2._1
          val esum = e1 + e2
          (esum, c1._2 + c2._2)
        })
      val Loss = (loss2 / counte.toDouble) * 0.5
      if (n == 0) {
        rL(n) = Loss
      } else {
        rL(n) = 0.09 * rL(n - 1) + 0.01 * Loss
      }
      n = n + 1
    }
    initEndTime = System.currentTimeMillis()
    // print the epoch summary
    printf("epoch: numepochs = %d , Took = %d seconds; batch train mse = %f.\n", i, scala.math.ceil((initEndTime - initStartTime).toDouble / 1000).toLong, rL(n - 1))
  }
  // compute the training error and the cross-validation error
  // Full-batch train mse
  var loss_train_e = 0.0
  var loss_val_e = 0.0
  loss_train_e = CNN.CNNeval(train_t, sc.broadcast(cnn_layers), sc.broadcast(cnn_ffb), sc.broadcast(cnn_ffW))
  if (validation > 0) loss_val_e = CNN.CNNeval(train_v, sc.broadcast(cnn_layers), sc.broadcast(cnn_ffb), sc.broadcast(cnn_ffW))
  printf("epoch: Full-batch train mse = %f, val mse = %f.\n", loss_train_e, loss_val_e)
  new CNNModel(cnn_layers, cnn_ffW, cnn_ffb)
}
(6) CNNff
Forward propagation: computes each layer's output from the input layer through the hidden layers to the output layer, i.e. the output value of every node in every layer.
Input parameters:
batch_xy1: the sample data
bc_cnn_layers: the per-layer parameters
bc_cnn_ffb: the bias parameters
bc_cnn_ffW: the weight parameters
Output:
The computed result of each layer.
/**
 * CNNff: forward propagation
 * Computes the output value of every node in the network.
 */
def CNNff(
  batch_xy1: RDD[(BDM[Double], BDM[Double])],
  bc_cnn_layers: org.apache.spark.broadcast.Broadcast[Array[CNNLayers]],
  bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
  bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]]): RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double])] = {
  // layer 1: a(1) = [x]
  val train_data1 = batch_xy1.map { f =>
    val lable = f._1
    val features = f._2
    val nna1 = Array(features)
    val nna = ArrayBuffer[Array[BDM[Double]]]()
    nna += nna1
    (lable, nna)
  }
  // layers 2 to n-1
  val train_data2 = train_data1.map { f =>
    val lable = f._1
    val nn_a = f._2
    var inputmaps1 = 1.0
    val n = bc_cnn_layers.value.length
    // for each layer
    for (l <- 1 to n - 1) {
      val type1 = bc_cnn_layers.value(l).types
      val outputmap1 = bc_cnn_layers.value(l).outputmaps
      val kernelsize1 = bc_cnn_layers.value(l).kernelsize
      val scale1 = bc_cnn_layers.value(l).scale
      val k1 = bc_cnn_layers.value(l).k
      val b1 = bc_cnn_layers.value(l).b
      val nna1 = ArrayBuffer[BDM[Double]]()
      if (type1 == "c") {
        for (j <- 0 to outputmap1.toInt - 1) { // output map
          // create temp output map
          var z = BDM.zeros[Double](nn_a(l - 1)(0).rows - kernelsize1.toInt + 1, nn_a(l - 1)(0).cols - kernelsize1.toInt + 1)
          for (i <- 0 to inputmaps1.toInt - 1) { // input map
            // convolve with corresponding kernel and add to temp output map
            // z = z + convn(net.layers{l - 1}.a{i}, net.layers{l}.k{i}{j}, 'valid');
            z = z + convn(nn_a(l - 1)(i), k1(i)(j), "valid")
          }
          // add bias, pass through nonlinearity
          // net.layers{l}.a{j} = sigm(z + net.layers{l}.b{j})
          val nna0 = sigm(z + b1(j))
          nna1 += nna0
        }
        nn_a += nna1.toArray
        inputmaps1 = outputmap1
      } else if (type1 == "s") {
        for (j <- 0 to inputmaps1.toInt - 1) {
          // z = convn(net.layers{l - 1}.a{j}, ones(net.layers{l}.scale) / (net.layers{l}.scale ^ 2), 'valid');
          // net.layers{l}.a{j} = z(1 : net.layers{l}.scale : end, 1 : net.layers{l}.scale : end, :);
          val z = convn(nn_a(l - 1)(j), BDM.ones[Double](scale1.toInt, scale1.toInt) / (scale1 * scale1), "valid")
          val zs1 = z(::, 0 to -1 by scale1.toInt).t + 0.0
          val zs2 = zs1(::, 0 to -1 by scale1.toInt).t + 0.0
          val nna0 = zs2
          nna1 += nna0
        }
        nn_a += nna1.toArray
      }
    }
    // concatenate all end-layer feature maps into a vector
    val nn_fv1 = ArrayBuffer[Double]()
    for (j <- 0 to nn_a(n - 1).length - 1) {
      nn_fv1 ++= nn_a(n - 1)(j).data
    }
    val nn_fv = new BDM[Double](nn_fv1.length, 1, nn_fv1.toArray)
    // feedforward into output perceptrons
    // net.o = sigm(net.ffW * net.fv + repmat(net.ffb, 1, size(net.fv, 2)));
    val nn_o = sigm(bc_cnn_ffW.value * nn_fv + bc_cnn_ffb.value)
    (lable, nn_a.toArray, nn_fv, nn_o)
  }
  train_data2
}
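The "s" branch above (convolving with ones(scale, scale) / scale^2 and then keeping every scale-th row and column) amounts to non-overlapping mean pooling. A minimal equivalent sketch, assuming the map dimensions are divisible by scale:

import breeze.linalg.{DenseMatrix => BDM}

// Non-overlapping mean pooling over scale x scale blocks.
def meanPool(a: BDM[Double], scale: Int): BDM[Double] = {
  val out = BDM.zeros[Double](a.rows / scale, a.cols / scale)
  for (i <- 0 until out.rows; j <- 0 until out.cols) {
    val block = a(i * scale to (i + 1) * scale - 1, j * scale to (j + 1) * scale - 1)
    out(i, j) = block.sum / (scale * scale)
  }
  out
}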
(7) CNNbp
Backward propagation: computes the derivatives layer by layer, from the output layer through the hidden layers back to the input layer, i.e. the partial derivative at every node (error back-propagation).
Input parameters:
train_cnnff: the forward-propagation results
bc_cnn_layers: the per-layer parameters
bc_cnn_ffb: the bias parameters
bc_cnn_ffW: the weight parameters
Output:
The partial derivatives computed for each layer.
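Before the listing, a minimal sketch of the output-layer sensitivity it starts from (mirroring the net.od expression in the comments below): for a sigmoid output o and target y, the delta is (o - y) scaled element-wise by the sigmoid derivative o .* (1 - o):

import breeze.linalg.{DenseMatrix => BDM}

// Output-layer delta for a sigmoid output layer.
def outputDelta(o: BDM[Double], y: BDM[Double]): BDM[Double] = {
  val e = o - y              // net.e = net.o - y
  e :* (o :* (1.0 - o))      // net.od = net.e .* (net.o .* (1 - net.o))
}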
/**
 * CNNbp: back propagation
 * Computes the average partial derivatives of the weights.
 */
def CNNbp(
  train_cnnff: RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double])],
  bc_cnn_layers: org.apache.spark.broadcast.Broadcast[Array[CNNLayers]],
  bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
  bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]]): (RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double], BDM[Double], BDM[Double], BDM[Double], Array[Array[BDM[Double]]])], BDM[Double], BDM[Double], Array[CNNLayers]) = {
  // error: net.e = net.o - y
  val n = bc_cnn_layers.value.length
  val train_data3 = train_cnnff.map { f =>
    val nn_e = f._4 - f._1
    (f._1, f._2, f._3, f._4, nn_e)
  }
  // backprop deltas
  // sensitivity (residual) of the output layer
  // net.od = net.e .* (net.o .* (1 - net.o))
  // net.fvd = (net.ffW' * net.od)
  val train_data4 = train_data3.map { f =>
    val nn_e = f._5
    val nn_o = f._4
    val nn_fv = f._3
    val nn_od = nn_e :* (nn_o :* (1.0 - nn_o))
    val nn_fvd = if (bc_cnn_layers.value(n - 1).types == "c") {
      // net.fvd = net.fvd .* (net.fv .* (1 - net.fv));
      val nn_fvd1 = bc_cnn_ffW.value.t * nn_od
      val nn_fvd2 = nn_fvd1 :* (nn_fv :* (1.0 - nn_fv))
      nn_fvd2
    } else {
      val nn_fvd1 = bc_cnn_ffW.value.t * nn_od
      nn_fvd1
    }
    (f._1, f._2, f._3, f._4, f._5, nn_od, nn_fvd)
  }
  // reshape feature vector deltas into output map style
  val sa1 = train_data4.map(f => f._2(n - 1)(1)).take(1)(0).rows
  val sa2 = train_data4.map(f => f._2(n - 1)(1)).take(1)(0).cols
  val sa3 = 1
  val fvnum = sa1 * sa2
  val train_data5 = train_data4.map { f =>
    val nn_a = f._2
    val nn_fvd = f._7
    val nn_od = f._6
    val nn_fv = f._3
    var nnd = new Array[Array[BDM[Double]]](n)
    val nnd1 = ArrayBuffer[BDM[Double]]()
    for (j <- 0 to nn_a(n - 1).length - 1) {
      val tmp1 = nn_fvd((j * fvnum) to ((j + 1) * fvnum - 1), 0)
      val tmp2 = new BDM(sa1, sa2, tmp1.data)
      nnd1 += tmp2
    }
    nnd(n - 1) = nnd1.toArray
    for (l <- (n - 2) to 0 by -1) {
      val type1 = bc_cnn_layers.value(l).types
      var nnd2 = ArrayBuffer[BDM[Double]]()
      if (type1 == "c") {
        for (j <- 0 to nn_a(l).length - 1) {
          val tmp_a = nn_a(l)(j)
          val tmp_d = nnd(l + 1)(j)
          val tmp_scale = bc_cnn_layers.value(l + 1).scale.toInt
          val tmp1 = tmp_a :* (1.0 - tmp_a)
          val tmp2 = expand(tmp_d, Array(tmp_scale, tmp_scale)) / (tmp_scale.toDouble * tmp_scale)
          nnd2 += (tmp1 :* tmp2)
        }
      } else if (type1 == "s") {
        for (i <- 0 to nn_a(l).length - 1) {
          var z = BDM.zeros[Double](nn_a(l)(0).rows, nn_a(l)(0).cols)
          for (j <- 0 to nn_a(l + 1).length - 1) {
            // z = z + convn(net.layers{l + 1}.d{j}, rot180(net.layers{l + 1}.k{i}{j}), 'full');
            z = z + convn(nnd(l + 1)(j), Rot90(Rot90(bc_cnn_layers.value(l + 1).k(i)(j))), "full")
          }
          nnd2 += z
        }
      }
      nnd(l) = nnd2.toArray
    }
    (f._1, f._2, f._3, f._4, f._5, f._6, f._7, nnd)
  }
  // dk, db: calc gradients
  var cnn_layers = bc_cnn_layers.value
  for (l <- 1 to n - 1) {
    val type1 = bc_cnn_layers.value(l).types
    val lena1 = train_data5.map(f => f._2(l).length).take(1)(0)
    val lena2 = train_data5.map(f => f._2(l - 1).length).take(1)(0)
    if (type1 == "c") {
      for (j <- 0 to lena1 - 1) {
        for (i <- 0 to lena2 - 1) {
          val rdd_dk_ij = train_data5.map { f =>
            val nn_a = f._2
            val nn_d = f._8
            val tmp_d = nn_d(l)(j)
            val tmp_a = nn_a(l - 1)(i)
            convn(Rot90(Rot90(tmp_a)), tmp_d, "valid")
          }
          val initdk = BDM.zeros[Double](rdd_dk_ij.take(1)(0).rows, rdd_dk_ij.take(1)(0).cols)
          val (dk_ij, count_dk) = rdd_dk_ij.treeAggregate((initdk, 0L))(
            seqOp = (c, v) => {
              // c: (m, count), v: (m)
              val m1 = c._1
              val m2 = m1 + v
              (m2, c._2 + 1)
            },
            combOp = (c1, c2) => {
              // c: (m, count)
              val m1 = c1._1
              val m2 = c2._1
              val m3 = m1 + m2
              (m3, c1._2 + c2._2)
            })
          val dk = dk_ij / count_dk.toDouble
          cnn_layers(l).dk(i)(j) = dk
        }
        val rdd_db_j = train_data5.map { f =>
          val nn_d = f._8
          val tmp_d = nn_d(l)(j)
          Bsum(tmp_d)
        }
        val db_j = rdd_db_j.reduce(_ + _)
        val count_db = rdd_db_j.count
        val db = db_j / count_db.toDouble
        cnn_layers(l).db(j) = db
      }
    }
  }
  // net.dffW = net.od * (net.fv)' / size(net.od, 2);
  // net.dffb = mean(net.od, 2);
  val train_data6 = train_data5.map { f =>
    val nn_od = f._6
    val nn_fv = f._3
    nn_od * nn_fv.t
  }
  val train_data7 = train_data5.map { f =>
    val nn_od = f._6
    nn_od
  }
  val initffW = BDM.zeros[Double](bc_cnn_ffW.value.rows, bc_cnn_ffW.value.cols)
  val (ffw2, countfffw2) = train_data6.treeAggregate((initffW, 0L))(
    seqOp = (c, v) => {
      // c: (m, count), v: (m)
      val m1 = c._1
      val m2 = m1 + v
      (m2, c._2 + 1)
    },
    combOp = (c1, c2) => {
      // c: (m, count)
      val m1 = c1._1
      val m2 = c2._1
      val m3 = m1 + m2
      (m3, c1._2 + c2._2)
    })
  val cnn_dffw = ffw2 / countfffw2.toDouble
  val initffb = BDM.zeros[Double](bc_cnn_ffb.value.rows, bc_cnn_ffb.value.cols)
  val (ffb2, countfffb2) = train_data7.treeAggregate((initffb, 0L))(
    seqOp = (c, v) => {
      // c: (m, count), v: (m)
      val m1 = c._1
      val m2 = m1 + v
      (m2, c._2 + 1)
    },
    combOp = (c1, c2) => {
      // c: (m, count)
      val m1 = c1._1
      val m2 = c2._1
      val m3 = m1 + m2
      (m3, c1._2 + c2._2)
    })
  val cnn_dffb = ffb2 / countfffb2.toDouble
  (train_data5, cnn_dffw, cnn_dffb, cnn_layers)
}
(8) CNNapplygrads
Weight update.
Input parameters:
train_cnnbp: the output of CNNbp
bc_cnn_ffb: the network bias parameters
bc_cnn_ffW: the network weight parameters
alpha: the learning rate for the update
Output: (cnn_ffW, cnn_ffb, cnn_layers), the updated weight parameters.
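For reference, a plain gradient-descent step scaled by alpha would look like the sketch below; note that the listing that follows applies dk, db, cnn_dffw and cnn_dffb directly, without the alpha scaling:

import breeze.linalg.{DenseMatrix => BDM}

// Plain SGD step: W := W - alpha * dW (reference sketch only).
def sgdStep(w: BDM[Double], dw: BDM[Double], alpha: Double): BDM[Double] =
  w - (dw * alpha)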
/**
 * CNNapplygrads: weight update
 */
def CNNapplygrads(
  train_cnnbp: (RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double], BDM[Double], BDM[Double], BDM[Double], Array[Array[BDM[Double]]])], BDM[Double], BDM[Double], Array[CNNLayers]),
  bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
  bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]],
  alpha: Double): (BDM[Double], BDM[Double], Array[CNNLayers]) = {
  val train_data5 = train_cnnbp._1
  val cnn_dffw = train_cnnbp._2
  val cnn_dffb = train_cnnbp._3
  var cnn_layers = train_cnnbp._4
  var cnn_ffb = bc_cnn_ffb.value
  var cnn_ffW = bc_cnn_ffW.value
  val n = cnn_layers.length
  for (l <- 1 to n - 1) {
    val type1 = cnn_layers(l).types
    val lena1 = train_data5.map(f => f._2(l).length).take(1)(0)
    val lena2 = train_data5.map(f => f._2(l - 1).length).take(1)(0)
    if (type1 == "c") {
      for (j <- 0 to lena1 - 1) {
        for (ii <- 0 to lena2 - 1) {
          cnn_layers(l).k(ii)(j) = cnn_layers(l).k(ii)(j) - cnn_layers(l).dk(ii)(j)
        }
        cnn_layers(l).b(j) = cnn_layers(l).b(j) - cnn_layers(l).db(j)
      }
    }
  }
  cnn_ffW = cnn_ffW + cnn_dffw
  cnn_ffb = cnn_ffb + cnn_dffb
  (cnn_ffW, cnn_ffb, cnn_layers)
}
(9) CNNeval
Error computation: runs forward propagation and returns half the mean squared output error.
/**
 * CNNeval: forward propagation plus output-error computation
 * Computes the output of every node and returns the average error.
 */
def CNNeval(
  batch_xy1: RDD[(BDM[Double], BDM[Double])],
  bc_cnn_layers: org.apache.spark.broadcast.Broadcast[Array[CNNLayers]],
  bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
  bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]]): Double = {
  // CNNff: forward propagation
  val train_cnnff = CNN.CNNff(batch_xy1, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)
  // error and loss
  // compute the output error
  val rdd_loss1 = train_cnnff.map { f =>
    val nn_e = f._4 - f._1
    nn_e
  }
  val (loss2, counte) = rdd_loss1.treeAggregate((0.0, 0L))(
    seqOp = (c, v) => {
      // c: (e, count), v: (m)
      val e1 = c._1
      val e2 = (v :* v).sum
      val esum = e1 + e2
      (esum, c._2 + 1)
    },
    combOp = (c1, c2) => {
      // c: (e, count)
      val e1 = c1._1
      val e2 = c2._1
      val esum = e1 + e2
      (esum, c1._2 + c2._2)
    })
  val Loss = (loss2 / counte.toDouble) * 0.5
  Loss
}
2.2.4 CNNModel Analysis
(1) CNNModel
CNNModel: stores the CNN parameters, including cnn_layers (the configuration of each layer), cnn_ffW (the output-layer weights) and cnn_ffb (the output-layer biases).
class CNNModel(
  val cnn_layers: Array[CNNLayers],
  val cnn_ffW: BDM[Double],
  val cnn_ffb: BDM[Double]) extends Serializable {
}
(2) predict
predict: computes predictions using the model.
/**
 * Return the prediction results
 * Format: (label, feature, predict_label, error)
 */
def predict(dataMatrix: RDD[(BDM[Double], BDM[Double])]): RDD[PredictCNNLabel] = {
  val sc = dataMatrix.sparkContext
  val bc_cnn_layers = sc.broadcast(cnn_layers)
  val bc_cnn_ffW = sc.broadcast(cnn_ffW)
  val bc_cnn_ffb = sc.broadcast(cnn_ffb)
  // CNNff: forward propagation
  val train_cnnff = CNN.CNNff(dataMatrix, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)
  val rdd_predict = train_cnnff.map { f =>
    val label = f._1
    val nna1 = f._2(0)(0)
    val nnan = f._4
    val error = f._4 - f._1
    PredictCNNLabel(label, nna1, nnan, error)
  }
  rdd_predict
}
(3) Loss
Loss: computes the error from the prediction results.
/**
 * Compute the output error
 * (average error)
 */
def Loss(predict: RDD[PredictCNNLabel]): Double = {
  val predict1 = predict.map(f => f.error)
  // error and loss
  // compute the output error
  val loss1 = predict1
  val (loss2, counte) = loss1.treeAggregate((0.0, 0L))(
    seqOp = (c, v) => {
      // c: (e, count), v: (m)
      val e1 = c._1
      val e2 = (v :* v).sum
      val esum = e1 + e2
      (esum, c._2 + 1)
    },
    combOp = (c1, c2) => {
      // c: (e, count)
      val e1 = c1._1
      val e2 = c2._1
      val esum = e1 + e2
      (esum, c1._2 + c2._2)
    })
  val Loss = (loss2 / counte.toDouble) * 0.5
  Loss
}
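A hedged usage sketch: test_d, its preparation and the variable names are assumptions; the second half reproduces locally, on illustrative values, exactly what Loss returns (half the mean per-sample squared error):

import breeze.linalg.{DenseMatrix => BDM}

// val test_d: RDD[(BDM[Double], BDM[Double])] = ...   // prepared like the training data
// val predictions = model.predict(test_d)             // RDD[PredictCNNLabel]
// val mse = model.Loss(predictions)

// What Loss computes, on two illustrative error matrices:
// 0.5 * (sum over samples of the squared errors) / (number of samples).
val errors = Seq(BDM((0.1, -0.2)), BDM((0.0, 0.3)))
val loss = 0.5 * errors.map(e => (e :* e).sum).sum / errors.length
println(loss) // 0.5 * (0.05 + 0.09) / 2 = 0.035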