【DeepLearning】Exercise: Implement deep networks for digit classification

Exercise: Implement deep networks for digit classification

Exercise link: Exercise: Implement deep networks for digit classification

stackedAEPredict.m

function [pred] = stackedAEPredict(theta, inputSize, hiddenSize, numClasses, netconfig, data)

% stackedAEPredict: Takes a trained theta and a test data set,

% and returns the predicted labels for each example.

% theta: trained weights from the autoencoder

% inputSize: the number of input units

% hiddenSize:  the number of hidden units *at the 2nd layer*

% numClasses:  the number of categories

% data: Our matrix containing the training data as columns.  So, data(:,i) is the i-th training example. 

% Your code should produce the prediction matrix

% pred, where pred(i) is argmax_c P(y(c) | x(i)).

%% Unroll theta parameter

% We first extract the part which computes the softmax gradient

softmaxTheta = reshape(theta(1:hiddenSize*numClasses), numClasses, hiddenSize);

% Extract out the "stack"

stack = params2stack(theta(hiddenSize*numClasses+1:end), netconfig);

%% ---------- YOUR CODE HERE --------------------------------------

%  Instructions: Compute pred using theta assuming that the labels start

%                from 1.

numCases = size(data, 2);

% forward

z2 = stack{1}.w * data + repmat(stack{1}.b, 1, numCases);

a2 = sigmoid(z2);

z3 = stack{2}.w * a2 + repmat(stack{2}.b, 1, numCases);

a3 = sigmoid(z3);

[~, pred] = max(softmaxTheta * a3);

% -----------------------------------------------------------

end

% You might find this useful

function sigm = sigmoid(x)

    sigm = 1 ./ (1 + exp(-x));

end
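
The prediction step above hard-codes two hidden layers. As a minimal sketch of the same idea, the forward pass can be written as a loop over the stack so that it works for any depth. The weights, biases, and data below are toy values chosen for illustration (assumptions, not the trained parameters of the exercise), and the sigmoid is defined inline so the snippet runs on its own.

```matlab
% Toy sketch: forward pass through a stack of any depth, then softmax argmax.
% All values below are illustrative assumptions, not trained parameters.
sigmoid = @(x) 1 ./ (1 + exp(-x));

stack = cell(2, 1);
stack{1}.w = [1 -1; -1 1; 0.5 0.5];   % 3 hidden units, 2 inputs
stack{1}.b = [0; 0; 0];
stack{2}.w = [2 -2 0; -2 2 0];        % 2 hidden units, 3 inputs
stack{2}.b = [0; 0];
softmaxTheta = [3 0; 0 3; 1 1];       % numClasses x hiddenSize (3 x 2)

data = [1 0 2 0; 0 1 0 3];            % each column is one example
numCases = size(data, 2);

% Propagate the activations layer by layer
a = data;
for d = 1:numel(stack)
    a = sigmoid(stack{d}.w * a + repmat(stack{d}.b, 1, numCases));
end

% The softmax is monotonic in its scores, so the argmax of the raw
% scores already gives the most probable class (labels start from 1)
[~, pred] = max(softmaxTheta * a, [], 1);
disp(pred);
```

Because only the argmax is needed, the exponentiation and normalisation of the softmax can be skipped at prediction time, which is also what the listing above does.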

stackedAECost.m

function [ cost, grad ] = stackedAECost(theta, inputSize, hiddenSize, ...

                                              numClasses, netconfig, ...

                                              lambda, data, labels)

% stackedAECost: Takes a trained softmaxTheta and a training data set with labels,

% and returns cost and gradient using a stacked autoencoder model. Used for

% finetuning.

% theta: trained weights from the autoencoder

% inputSize: the number of input units

% hiddenSize:  the number of hidden units *at the 2nd layer*

% numClasses:  the number of categories

% netconfig:   the network configuration of the stack

% lambda:      the weight regularization penalty

% data: Our matrix containing the training data as columns.  So, data(:,i) is the i-th training example.

% labels: A vector containing labels, where labels(i) is the label for the

% i-th training example

%% Unroll softmaxTheta parameter

% We first extract the part which computes the softmax gradient

softmaxTheta = reshape(theta(1:hiddenSize*numClasses), numClasses, hiddenSize);

% Extract out the "stack"

stack = params2stack(theta(hiddenSize*numClasses+1:end), netconfig);

% You will need to compute the following gradients

softmaxThetaGrad = zeros(size(softmaxTheta));

stackgrad = cell(size(stack));

for d = 1:numel(stack)

    stackgrad{d}.w = zeros(size(stack{d}.w));

    stackgrad{d}.b = zeros(size(stack{d}.b));

end

cost = 0; % You need to compute this

% You might find these variables useful

numCases = size(data, 2);

groundTruth = full(sparse(labels, 1:numCases, 1));

%% --------------------------- YOUR CODE HERE -----------------------------

%  Instructions: Compute the cost function and gradient vector for

%                the stacked autoencoder.

%

%                You are given a stack variable which is a cell-array of

%                the weights and biases for every layer. In particular, you

%                can refer to the weights of Layer d, using stack{d}.w and

%                the biases using stack{d}.b . To get the total number of

%                layers, you can use numel(stack).

%

%                The last layer of the network is connected to the softmax

%                classification layer, softmaxTheta.

%

%                You should compute the gradients for the softmaxTheta,

%                storing that in softmaxThetaGrad. Similarly, you should

%                compute the gradients for each layer in the stack, storing

%                the gradients in stackgrad{d}.w and stackgrad{d}.b

%                Note that the size of the matrices in stackgrad should

%                match exactly that of the size of the matrices in stack.

%

% Forward pass through the two hidden layers

z2 = stack{1}.w * data + repmat(stack{1}.b, 1, numCases);

a2 = sigmoid(z2);

z3 = stack{2}.w * a2 + repmat(stack{2}.b, 1, numCases);

a3 = sigmoid(z3);

% Softmax probabilities; subtracting the column max avoids overflow in exp

M = softmaxTheta * a3;

M = bsxfun(@minus, M, max(M, [], 1));

M = exp(M);

M = bsxfun(@rdivide, M, sum(M));

diff = groundTruth - M;

% Cross-entropy cost plus weight decay on the softmax parameters

cost = -(1/numCases) * sum(sum(groundTruth .* log(M))) + (lambda/2) * sum(sum(softmaxTheta .* softmaxTheta));

for i=1:numClasses

    softmaxThetaGrad(i, :) = -(1/numCases) * (sum(a3 .* repmat(diff(i, :), hiddenSize, 1), 2))' + lambda * softmaxTheta(i, :);

end

% Backpropagate the softmax error through the stack

delta3 = - (softmaxTheta' * diff) .* sigmoiddiff(z3);

stackgrad{2}.w = delta3 * (a2)' ./ numCases;

stackgrad{2}.b = sum(delta3, 2)./ numCases;

delta2 = (stack{2}.w' * delta3) .* sigmoiddiff(z2);

stackgrad{1}.w = delta2 * data'./ numCases;

stackgrad{1}.b = sum(delta2, 2)./ numCases;

% -------------------------------------------------------------------------

%% Roll gradient vector

grad = [softmaxThetaGrad(:) ; stack2params(stackgrad)];

end

% You might find this useful

function sigm = sigmoid(x)

    sigm = 1 ./ (1 + exp(-x));

end

function sigmdiff = sigmoiddiff(x)

    sigmdiff = sigmoid(x) .* (1 - sigmoid(x));

end
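
The row-by-row loop that fills softmaxThetaGrad can also be written as a single matrix product. The sketch below checks that equivalence on toy data; the sizes, the values produced by linspace, and the variable name err (used here in place of diff, to avoid shadowing the built-in function) are assumptions made for illustration only.

```matlab
% Toy sizes and values (illustrative assumptions)
hiddenSize = 3; numClasses = 4; numCases = 5; lambda = 1e-2;
a3 = reshape(linspace(0.1, 0.9, hiddenSize * numCases), hiddenSize, numCases);
softmaxTheta = reshape(linspace(-1, 1, numClasses * hiddenSize), numClasses, hiddenSize);
labels = [1 2 3 4 1];
groundTruth = full(sparse(labels, 1:numCases, 1));

% Numerically stable softmax: subtracting the column max leaves the
% normalised probabilities unchanged but keeps exp from overflowing
M = softmaxTheta * a3;
M = exp(bsxfun(@minus, M, max(M, [], 1)));
M = bsxfun(@rdivide, M, sum(M, 1));
err = groundTruth - M;

% Row-by-row gradient, following the loop in the listing
gradLoop = zeros(numClasses, hiddenSize);
for i = 1:numClasses
    gradLoop(i, :) = -(1/numCases) * sum(a3 .* repmat(err(i, :), hiddenSize, 1), 2)' ...
                     + lambda * softmaxTheta(i, :);
end

% Equivalent single matrix product
gradVec = -(1/numCases) * (err * a3') + lambda * softmaxTheta;

maxAbsDiff = max(abs(gradLoop(:) - gradVec(:)));
fprintf('max abs difference: %g\n', maxAbsDiff);
```

The vectorized form avoids the repmat inside the loop, which tends to matter more as numClasses and the number of training examples grow.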

stackedAEExercise.m

%% CS294A/CS294W Stacked Autoencoder Exercise

%  Instructions

%  ------------

%

%  This file contains code that helps you get started on the

%  stacked autoencoder exercise. You will need to complete code in

%  stackedAECost.m

%  You will also need to have implemented sparseAutoencoderCost.m and

%  softmaxCost.m from previous exercises. You will need the initializeParameters.m

%  loadMNISTImages.m, and loadMNISTLabels.m files from previous exercises.

%

%  For the purpose of completing the assignment, you do not need to

%  change the code in this file.

%

%%======================================================================

%% STEP 0: Here we provide the relevant parameters values that will

%  allow your sparse autoencoder to get good filters; you do not need to

%  change the parameters below.

inputSize = 28 * 28;

numClasses = 10;

hiddenSizeL1 = 200;    % Layer 1 Hidden Size

hiddenSizeL2 = 200;    % Layer 2 Hidden Size

sparsityParam = 0.1;   % desired average activation of the hidden units.

                       % (This was denoted by the Greek letter rho, which looks like a lower-case "p",

                       %  in the lecture notes).

lambda = 3e-3;         % weight decay parameter

beta = 3;              % weight of sparsity penalty term

%%======================================================================

%% STEP 1: Load data from the MNIST database

%

%  This loads our training data from the MNIST database files.

% Load MNIST database files

trainData = loadMNISTImages('mnist/train-images-idx3-ubyte');

trainLabels = loadMNISTLabels('mnist/train-labels-idx1-ubyte');

trainLabels(trainLabels == 0) = 10; % Remap 0 to 10 since our labels need to start from 1

%%======================================================================

%% STEP 2: Train the first sparse autoencoder

%  This trains the first sparse autoencoder on the unlabelled MNIST training

%  images.

%  If you've correctly implemented sparseAutoencoderCost.m, you don't need

%  to change anything here.

%  Randomly initialize the parameters

sae1Theta = initializeParameters(hiddenSizeL1, inputSize);

%% ---------------------- YOUR CODE HERE  ---------------------------------

%  Instructions: Train the first layer sparse autoencoder, this layer has

%                a hidden size of "hiddenSizeL1"

%                You should store the optimal parameters in sae1OptTheta

addpath minFunc/

options.Method = 'lbfgs'; % Here, we use L-BFGS to optimize our cost

                          % function. Generally, for minFunc to work, you

                          % need a function pointer with two outputs: the

                          % function value and the gradient. In our problem,

                          % sparseAutoencoderCost.m satisfies this.

options.maxIter = 400;    % Maximum number of iterations of L-BFGS to run (assumed value; the original number is missing)

options.display = 'on';

[sae1OptTheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...

                                   inputSize, hiddenSizeL1, ...

                                   lambda, sparsityParam, ...

                                   beta, trainData), ...

                              sae1Theta, options);

% -------------------------------------------------------------------------

%%======================================================================

%% STEP 3: Train the second sparse autoencoder

%  This trains the second sparse autoencoder on the first autoencoder

%  features.

%  If you've correctly implemented sparseAutoencoderCost.m, you don't need

%  to change anything here.

[sae1Features] = feedForwardAutoencoder(sae1OptTheta, hiddenSizeL1, ...

                                        inputSize, trainData);

%  Randomly initialize the parameters

sae2Theta = initializeParameters(hiddenSizeL2, hiddenSizeL1);

%% ---------------------- YOUR CODE HERE  ---------------------------------

%  Instructions: Train the second layer sparse autoencoder, this layer has

%                a hidden size of "hiddenSizeL2" and an input size of

%                "hiddenSizeL1"

%

%                You should store the optimal parameters in sae2OptTheta

options.Method = 'lbfgs'; % Here, we use L-BFGS to optimize our cost

                          % function. Generally, for minFunc to work, you

                          % need a function pointer with two outputs: the

                          % function value and the gradient. In our problem,

                          % sparseAutoencoderCost.m satisfies this.

options.maxIter = 400;    % Maximum number of iterations of L-BFGS to run (assumed value; the original number is missing)

options.display = 'on';

[sae2OptTheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...

                                   hiddenSizeL1, hiddenSizeL2, ...

                                   lambda, sparsityParam, ...

                                   beta, sae1Features), ...

                              sae2Theta, options);

% -------------------------------------------------------------------------

%%======================================================================

%% STEP 4: Train the softmax classifier

%  This trains the softmax classifier on the second autoencoder features.

%  If you've correctly implemented softmaxCost.m, you don't need

%  to change anything here.

[sae2Features] = feedForwardAutoencoder(sae2OptTheta, hiddenSizeL2, ...

                                        hiddenSizeL1, sae1Features);

%  Randomly initialize the parameters

saeSoftmaxTheta = 0.005 * randn(hiddenSizeL2 * numClasses, 1);

%% ---------------------- YOUR CODE HERE  ---------------------------------

%  Instructions: Train the softmax classifier, the classifier takes in

%                input of dimension "hiddenSizeL2" corresponding to the

%                hidden layer size of the 2nd layer.

%

%                You should store the optimal parameters in saeSoftmaxOptTheta

%

%  NOTE: If you used softmaxTrain to complete this part of the exercise,

%        set saeSoftmaxOptTheta = softmaxModel.optTheta(:);

options.Method = 'lbfgs'; % Here, we use L-BFGS to optimize our cost

                          % function. Generally, for minFunc to work, you

                          % need a function pointer with two outputs: the

                          % function value and the gradient. In our problem,

                          % softmaxCost.m satisfies this.

options.display = 'on';

[saeSoftmaxOptTheta, cost] = minFunc( @(p) softmaxCost(p, ...

                                   numClasses, hiddenSizeL2, lambda, ...

                                   sae2Features, trainLabels), ...

                              saeSoftmaxTheta, options);

% -------------------------------------------------------------------------

%%======================================================================

%% STEP 5: Finetune softmax model

% Implement the stackedAECost to give the combined cost of the whole model

% then run this cell.

% Initialize the stack using the parameters learned

stack = cell(2,1);

stack{1}.w = reshape(sae1OptTheta(1:hiddenSizeL1*inputSize), ...

                     hiddenSizeL1, inputSize);

stack{1}.b = sae1OptTheta(2*hiddenSizeL1*inputSize+1:2*hiddenSizeL1*inputSize+hiddenSizeL1);

stack{2}.w = reshape(sae2OptTheta(1:hiddenSizeL2*hiddenSizeL1), ...

                     hiddenSizeL2, hiddenSizeL1);

stack{2}.b = sae2OptTheta(2*hiddenSizeL2*hiddenSizeL1+1:2*hiddenSizeL2*hiddenSizeL1+hiddenSizeL2);

% Initialize the parameters for the deep model

[stackparams, netconfig] = stack2params(stack);

stackedAETheta = [ saeSoftmaxOptTheta ; stackparams ];

%% ---------------------- YOUR CODE HERE  ---------------------------------

%  Instructions: Train the deep network, hidden size here refers to the

%                dimension of the input to the classifier, which corresponds

%                to "hiddenSizeL2".

%

%

options.Method = 'lbfgs'; % Here, we use L-BFGS to optimize our cost

                          % function. Generally, for minFunc to work, you

                          % need a function pointer with two outputs: the

                          % function value and the gradient. In our problem,

                          % stackedAECost.m satisfies this.

options.display = 'on';

[stackedAEOptTheta, cost] = minFunc( @(p) stackedAECost(p, ...

                                   inputSize, hiddenSizeL2, numClasses, ...

                                   netconfig, lambda, trainData, trainLabels), ...

                              stackedAETheta, options);

% -------------------------------------------------------------------------

%%======================================================================

%% STEP 6: Test

%  Instructions: You will need to complete the code in stackedAEPredict.m

%                before running this part of the code

%

% Get labelled test images

% Note that we apply the same kind of preprocessing as the training set

testData = loadMNISTImages('mnist/t10k-images-idx3-ubyte');

testLabels = loadMNISTLabels('mnist/t10k-labels-idx1-ubyte');

testLabels(testLabels == 0) = 10; % Remap 0 to 10

[pred] = stackedAEPredict(stackedAETheta, inputSize, hiddenSizeL2, ...

                          numClasses, netconfig, testData);

acc = mean(testLabels(:) == pred(:));

fprintf('Before Finetuning Test Accuracy: %0.3f%%\n', acc * 100);

[pred] = stackedAEPredict(stackedAEOptTheta, inputSize, hiddenSizeL2, ...

                          numClasses, netconfig, testData);

acc = mean(testLabels(:) == pred(:));

fprintf('After Finetuning Test Accuracy: %0.3f%%\n', acc * 100);

% Accuracy is the proportion of correctly classified images

% The results for our implementation were:

%

% Before Finetuning Test Accuracy: 87.7%

% After Finetuning Test Accuracy:  97.6%

%

% If your values are too low (accuracy less than 95%), you should check

% your code for errors, and make sure you are training on the

% entire data set of 60000 28x28 training images

% (unless you modified the loading code, this should be the case)

Before Finetuning Test Accuracy: 87.740%
After Finetuning Test Accuracy: 97.610%
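
The index arithmetic used to initialize the stack in the fine-tuning step depends on how the sparse autoencoder parameters are unrolled. The sketch below assumes the layout [W1(:); W2(:); b1(:); b2(:)], which is what the offsets in the script imply (b1 starts right after the two weight matrices, each holding hiddenSize*visibleSize entries). The toy sizes and values are made up for illustration.

```matlab
% Toy sizes (illustrative assumptions)
visibleSize = 4; hiddenSize = 3;

% Build a parameter vector with recognisable values
W1 = reshape(1:hiddenSize*visibleSize, hiddenSize, visibleSize);    % encoder weights
W2 = -reshape(1:hiddenSize*visibleSize, visibleSize, hiddenSize);   % decoder weights
b1 = [101; 102; 103];                                               % encoder bias
b2 = [201; 202; 203; 204];                                          % decoder bias
theta = [W1(:); W2(:); b1(:); b2(:)];   % assumed unrolled layout

% Recover the encoder weights and bias with the same slicing as the script
W1rec = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
b1rec = theta(2*hiddenSize*visibleSize+1 : 2*hiddenSize*visibleSize+hiddenSize);

disp(isequal(W1rec, W1) && isequal(b1rec, b1));
```

Only the encoder half (W1 and b1) of each autoencoder goes into the stack; the decoder parameters W2 and b2 are discarded once pretraining is done.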

本文版权归博客园和作者吴双本人共同所有,转载和爬虫请注明原文链接 http://www.cnblogs.com/tdws/ 下午分享了JavaScript实现单向链表,晚上就来补充下双向链表吧.对链表 ...