Neural Networks for Machine Learning: Lecture 9 Quiz

Question 1

You are experimenting with two different models for a classification task. The figures below show the classification error on the training data and the validation data as training progresses, for each of the two models. Which model do you think would perform better on previously unseen test data?

[Figure: classification error on the training and validation data as training progresses, for each of the two models]

Question 2

The figure below shows the histogram of weights for a learned Neural Network.
[Figure: histogram of the learned weights]

Which regularization technique has been used during learning?

L1 regularization

L2 regularization

adding weight noise

no regularization has been used

Question 3

Suppose you want to regularize the weights of a neural network during training so that many of its weights are quite close to zero, but a few are a very long way from zero. Which cost function would you add to your objective function?
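To reason about a question like this, it can help to look at how hard each penalty pulls a weight toward zero at different magnitudes. The sketch below is only an illustration: the three penalties (a quadratic cost, an absolute-value cost, and a heavy-tailed log cost) and the function name `penalty_gradients` are assumptions chosen for the comparison, not the quiz's own list of options.

```python
import numpy as np

def penalty_gradients(w, lam=1.0):
    """Return d(penalty)/dw for three illustrative weight costs.

    The choice of penalties is an assumption made for this sketch.
    """
    w = np.asarray(w, dtype=float)
    return {
        "l2": lam * w,                   # from (lam / 2) * w**2
        "l1": lam * np.sign(w),          # from lam * |w|
        "log": lam * w / (1.0 + w**2),   # from (lam / 2) * log(1 + w**2)
    }

# Compare the pull toward zero for a small, a medium, and a large weight.
weights = np.array([0.1, 1.0, 10.0])
for name, grad in penalty_gradients(weights).items():
    print(name, grad)
```

Comparing the printed values shows which penalties keep pushing on a weight that is already large and which ones relax, which is the property the question asks about.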

Question 4

In a linear regression task, a $d$-dimensional input vector $x$ is used to predict the output value $y$ using the weight vector $w$, where $y = w^T x$. The error function is $E = \frac{1}{2}(t - w^T x)^2$, where $t$ is the target output value. We want to use a student-t cost for the weights:

$C = \frac{\lambda}{2} \sum_{i=1}^{d} \log(1 + w_i^2)$.

The total error to be optimized is $E_{tot} = E + C$. What is the expression for $\frac{\partial E_{tot}}{\partial w_i}$?

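$\frac{\partial E_{tot}}{\partial w_i} = -(t - w_i x_i) - \frac{2 \lambda w_i}{1 + w_i^2}$

$\frac{\partial E_{tot}}{\partial w_i} = -(t - w_i x_i) - \lambda w_i$

$\frac{\partial E_{tot}}{\partial w_i} = -(t - w^T x) x_i + \frac{\lambda w_i}{1 + w_i^2}$

$\frac{\partial E_{tot}}{\partial w_i} = -(t - w^T x) x_i + \frac{\lambda}{(1 + w_i^2)^2}$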
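One way to check any candidate expression for the gradient is to compare it against a finite-difference estimate of the total error defined in the question. The following is a minimal sketch under stated assumptions: the function names `total_error` and `numerical_gradient`, the random test values, and the step size `eps` are all illustrative choices, not part of the quiz.

```python
import numpy as np

def total_error(w, x, t, lam):
    """E_tot = 0.5 * (t - w.x)^2 + (lam / 2) * sum_i log(1 + w_i^2)."""
    residual = t - np.dot(w, x)
    return 0.5 * residual**2 + 0.5 * lam * np.sum(np.log1p(w**2))

def numerical_gradient(w, x, t, lam, eps=1e-6):
    """Central-difference estimate of dE_tot/dw_i for every component i."""
    grad = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        grad[i] = (total_error(w + step, x, t, lam)
                   - total_error(w - step, x, t, lam)) / (2 * eps)
    return grad

# Arbitrary test point (assumed values, chosen only for illustration).
rng = np.random.default_rng(0)
w = rng.normal(size=4)
x = rng.normal(size=4)
t, lam = 0.7, 0.3
print(numerical_gradient(w, x, t, lam))
```

To use it, evaluate a candidate formula at the same `w`, `x`, `t`, and `lam` and compare the result with the printed estimate using `np.allclose`; a correct expression should agree to within the finite-difference error.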

Question 5

Different regularization methods have different effects on the learning process. For example, L2 regularization penalizes high weight values, while L1 regularization penalizes weight values that do not equal zero. Adding noise to the weights during learning ensures that the learned hidden representations take extreme values. Sampling the hidden representations regularizes the network by pushing the hidden representation to be binary during the forward pass, which limits the modeling capacity of the network.

Given the histogram of activations shown (taken just before the logistic nonlinearity) for a neural network, which regularization method has been used (check all that apply)?
[Figure: histogram of activations just before the logistic nonlinearity]

L1 regularization

L2 regularization

Sampling the hidden representation

Adding weight noise

Question 6

Suppose we have trained a neural network with one hidden layer and a single logistic output unit to predict whether or not an image contains a bird. If we retrain the network in the same way on the same data but with half as many hidden units, which of the following statements is true?

It will almost certainly do better on the test data.

It will almost certainly do worse on the training data.

It will almost certainly do worse on the test data.

It will almost certainly do better on the training data.
