Error Backpropagation
What you need to get
• YOU_a2.ipynb: a Python notebook (hereafter called “the notebook”)
• Network.pyc: Module with solutions
What to do
1. [2 marks] The logistic function is defined as
σ(z) = 1 / (1 + e^{-z}) .
Prove that
σ'(z) = σ(z) (1 − σ(z)) .
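Before (or after) proving the identity on paper, it can be sanity-checked numerically with a central finite difference. A minimal sketch, assuming NumPy (the function names here are my own, not from the notebook):

```python
import numpy as np

def logistic(z):
    # sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

# Compare a central-difference estimate of sigma'(z) against the
# claimed identity sigma(z) * (1 - sigma(z)) over a range of z values.
z = np.linspace(-4.0, 4.0, 9)
h = 1e-6  # step size for the finite difference
numeric = (logistic(z + h) - logistic(z - h)) / (2.0 * h)
analytic = logistic(z) * (1.0 - logistic(z))
print(np.max(np.abs(numeric - analytic)))  # should be near zero
```

A numerical check is no substitute for the proof, but it catches sign and algebra slips quickly.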
2. [4 marks] Consider a classification problem in which you have K classes. Suppose you have a labelled dataset containing pairs of inputs and class labels, (~x, ℓ), where ~x ∈ R^X and ℓ ∈ {1,2,…,K}.
Your neural network’s output is a classification vector based on the softmax activation function, so that if z_k is the input current for output node k, then the activation of output node y_k is
y_k = e^{z_k} / Σ_{j=1}^{K} e^{z_j} .
Thus, ~y ∈ [0,1]^K, and y_k = P(ℓ = k | ~x).
Suppose that your loss function is categorical cross entropy,
E(~y, ~t) = − Σ_{k=1}^{K} t_k ln y_k .
Derive an expression for ∂E/∂z_k, the gradient of the loss function with respect to the input current to the output layer.
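Whatever expression you derive can be checked against a finite-difference approximation of the gradient. A minimal sketch, assuming NumPy (the helper names are my own):

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability;
    # the result sums to 1.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(z, t):
    # E = -sum_k t_k ln y_k, with y = softmax(z)
    return -np.sum(t * np.log(softmax(z)))

def numeric_grad(f, z, h=1e-6):
    # Central-difference estimate of the gradient of f at z.
    g = np.zeros_like(z)
    for k in range(len(z)):
        zp, zm = z.copy(), z.copy()
        zp[k] += h
        zm[k] -= h
        g[k] = (f(zp) - f(zm)) / (2.0 * h)
    return g

z = np.array([0.5, -1.0, 2.0])
t = np.array([0.0, 0.0, 1.0])  # one-hot target: class 3
print(numeric_grad(lambda zz: cross_entropy(zz, t), z))
```

Comparing the printed vector against your derived ∂E/∂z_k for a few random z and t is a quick way to confirm the derivation.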
3. [4 marks] Consider the network y = f (~x,θ) with output loss function E (y,t), where y,t ∈ R. Let the input current to the output layer be z, so that y = σ(z), and σ(·) is the activation function for the nodes in the output layer.
Derive an expression for ∂E/∂z for the following two combinations:
(a) Cross entropy and the logistic activation function, σ(z) = 1/(1 + e^{-z}).
E(y,t) = −tlny − (1 − t)ln(1 − y)
(b) Mean squared error and the identity activation function, σ(z) = z.
E(y,t) = (y − t)2
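As in question 2, a scalar finite difference can confirm both derived expressions. A small sketch, assuming NumPy (names are my own):

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def dE_dz_numeric(E, z, h=1e-6):
    # Scalar central-difference estimate of dE/dz.
    return (E(z + h) - E(z - h)) / (2.0 * h)

t = 1.0
z = 0.3

# (a) cross entropy with logistic activation, y = logistic(z)
ce = lambda z: -t * np.log(logistic(z)) - (1 - t) * np.log(1 - logistic(z))
print(dE_dz_numeric(ce, z))

# (b) mean squared error with identity activation, y = z
mse = lambda z: (z - t) ** 2
print(dE_dz_numeric(mse, z))
```

Try a few (z, t) pairs, including t = 0, to make sure your expressions hold in general.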
4. Implementing Backpropagation
For this question, you must complete the implementation of the Network class. Notice that the Jupyter notebook has helper functions, a Layer class, and a Network class. Familiarize yourself with the code. Note that a Network contains a series of Layers, as well as connection matrices. Much of the functionality is already complete, but there are a few functions you have to finish. Follow these directions:
(a) FeedForward: Complete the function Network.FeedForward according to the specifications in its documentation. Note that your function must work for 2D input arrays containing multiple samples. The activities for each Layer should be stored in the corresponding Layer.h variable. [3 marks]
(b) CrossEntropy: Complete the Network.CrossEntropy function, which evaluates the average cross entropy between the supplied targets, and the activities of the top layer of the network. [1 mark]
(c) MSE: Complete the Network.MSE function, which evaluates the mean squared error between the supplied targets, and the activities of the top layer of the network. [1 mark]
(d) BackProp: Complete the Network.BackProp function, which uses the network state (after a feedforward pass) and the corresponding targets to compute the error gradients, and performs an update to the network weights and biases. [7 marks]
(e) Learn: Complete the Network.Learn function, which performs gradient descent over the training dataset to try to find the optimal network weights and biases. The function should perform the specified number of epochs, each time randomizing the order of the training samples (you can use the supplied function Shuffle for that). [4 marks]
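To illustrate how the pieces above fit together — a feedforward pass that stores activities, a backprop pass, and a learning loop that shuffles each epoch — here is a minimal standalone sketch in NumPy. It is not the notebook's Network/Layer API: every name (W1, lrate, shuffle, and so on) is my own, and it hard-codes one small architecture (tanh hidden layer, identity output, MSE loss) rather than the general class you must implement:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression dataset: learn T = sin(pi * X) on [-1, 1].
X = rng.uniform(-1, 1, size=(100, 1))
T = np.sin(np.pi * X)

# One tanh hidden layer of 10 nodes, identity output (hypothetical sizes).
W1 = rng.normal(0, 0.5, size=(1, 10)); b1 = np.zeros(10)
W2 = rng.normal(0, 0.5, size=(10, 1)); b2 = np.zeros(1)
lrate = 0.2

def shuffle(X, T):
    # Randomize sample order (the notebook supplies Shuffle for this role).
    idx = rng.permutation(len(X))
    return X[idx], T[idx]

for epoch in range(2000):
    X, T = shuffle(X, T)
    # Feedforward: keep the activities, as Layer.h does in the notebook.
    h1 = np.tanh(X @ W1 + b1)
    y = h1 @ W2 + b2
    # Backprop for E = mean over samples of (y - t)^2, identity output.
    dz2 = 2.0 * (y - T) / len(X)        # gradient w.r.t. output current
    dz1 = (dz2 @ W2.T) * (1 - h1**2)    # tanh'(z) = 1 - tanh(z)^2
    # Gradient-descent update of weights and biases.
    W2 -= lrate * (h1.T @ dz2); b2 -= lrate * dz2.sum(axis=0)
    W1 -= lrate * (X.T @ dz1);  b1 -= lrate * dz1.sum(axis=0)

print(np.mean((y - T) ** 2))  # training MSE, should be small
```

Your Network class generalizes this: FeedForward handles an arbitrary stack of Layers on 2D input, BackProp computes the gradients for the chosen loss, and Learn wraps the epoch loop with shuffling.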
For your convenience, the notebook includes two sample datasets for you to test your implementation on. One is a classification problem, and one is a regression problem. Your implementation should work with this code as given, but I encourage you to tinker with it.
There is also a pre-compiled module with my implementation of all the code. This might be helpful for testing, or if you have trouble with one of the earlier functions, and would like to move on and implement one of the dependent functions.
Enjoy!
What to submit
Your assignment submission should be a single Jupyter notebook file, named (<WatIAM>_a2.ipynb), where <WatIAM> is your UW WatIAM login ID (not your student number). The notebook must include solutions to all the questions. Submit this file to Desire2Learn. You do not need to submit any of the modules supplied for the assignment.