CS 446: Machine Learning Homework (Solution)

1. [8 points] Backpropagation
Consider the deep net in the figure below, consisting of an input layer, a hidden layer, and an output layer. The feed-forward computations performed by the deep net are as follows: every input $a_i$ is multiplied by a set of fully-connected weights $u_{ij}$ connecting the input layer to the hidden layer. The resulting weighted signals are then summed and combined with a bias $e_j$. This results in the activation signal $z_j = e_j + \sum_i a_i u_{ij}$. The hidden layer applies the activation function $g$ to $z_j$, resulting in the signal $b_j = g(z_j)$. In a similar fashion, the hidden layer activation signals $b_j$ are multiplied by the weights $w_{jk}$ connecting the hidden layer to the output layer, a bias $f_k$ is added, and the resulting signal $h_k = f_k + \sum_j b_j w_{jk}$ is transformed by the output activation function $g$ to form the network output $c_k = g(h_k)$. The loss between the desired target $t_k$ and the output $c_k$ is given by the MSE: $E = \frac{1}{2} \sum_k (c_k - t_k)^2$, where $t_k$ denotes the ground truth signal corresponding to $c_k$. Training a neural network involves determining the set of parameters $\theta = \{U, W, e, f\}$ that minimize $E$. This problem can be solved using gradient descent, which requires determining $\frac{\partial E}{\partial \theta}$ for all $\theta$ in the model.
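The feed-forward computation described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the problem does not fix the activation g, so a sigmoid is used as a stand-in, the MSE is assumed to take the common form with a factor of 1/2, and the function names (sigmoid, forward, mse_loss) are hypothetical.

```python
import math

def sigmoid(x):
    # Assumed activation; the problem leaves g unspecified.
    return 1.0 / (1.0 + math.exp(-x))

def forward(a, U, e, W, f, g=sigmoid):
    """Forward pass a -> z -> b -> h -> c for one hidden layer.

    U[i][j] connects input i to hidden unit j,
    W[j][k] connects hidden unit j to output unit k.
    """
    # z_j = e_j + sum_i a_i * u_ij
    z = [e[j] + sum(a[i] * U[i][j] for i in range(len(a)))
         for j in range(len(e))]
    # b_j = g(z_j)
    b = [g(zj) for zj in z]
    # h_k = f_k + sum_j b_j * w_jk
    h = [f[k] + sum(b[j] * W[j][k] for j in range(len(b)))
         for k in range(len(f))]
    # c_k = g(h_k)
    c = [g(hk) for hk in h]
    return z, b, h, c

def mse_loss(c, t):
    # Assumed form: E = 1/2 * sum_k (c_k - t_k)^2
    return 0.5 * sum((ck - tk) ** 2 for ck, tk in zip(c, t))
```

Returning the intermediate signals z, b and h alongside c is a deliberate choice here, since the error signals asked for in the later parts are expressed in terms of exactly those quantities.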

(b) We denote by $\delta_k = \frac{\partial E}{\partial h_k}$ the error signal of neuron $k$ in the second linear layer of the network. Compute $\delta_k$ as a function of $c_k$, $t_k$, $g'$ and $h_k$.
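A possible derivation, assuming the MSE takes the common form $E = \frac{1}{2} \sum_k (c_k - t_k)^2$ (the factor $\frac{1}{2}$ is an assumption here) and that the error signal is defined as $\delta_k = \frac{\partial E}{\partial h_k}$. Since $c_k = g(h_k)$ and only the $k$-th term of the sum depends on $h_k$, the chain rule gives:

$$
\delta_k = \frac{\partial E}{\partial h_k}
         = \frac{\partial E}{\partial c_k} \cdot \frac{\partial c_k}{\partial h_k}
         = (c_k - t_k) \, g'(h_k)
$$

Under a different normalization of the MSE, the result would only change by the corresponding constant factor.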

(e) We denote by $\psi_j = \frac{\partial E}{\partial z_j}$ the error signal of neuron $j$ in the first linear layer of the network. Compute $\psi_j$ as a function of $\delta_k$, $w_{jk}$, $g'$ and $z_j$.
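Applying the chain rule through b_j = g(z_j) and h_k = f_k + sum_j b_j w_jk suggests psi_j = g'(z_j) * sum_k delta_k * w_jk, where delta_k = (c_k - t_k) * g'(h_k). The sketch below checks this numerically against a central finite difference. It rests on several assumptions that are not part of the problem statement: g is taken to be a sigmoid, the MSE is assumed to be E = 1/2 * sum_k (c_k - t_k)^2, and all function names are hypothetical.

```python
import math

def g(x):
    # Assumed activation (sigmoid); the problem leaves g unspecified.
    return 1.0 / (1.0 + math.exp(-x))

def g_prime(x):
    # Derivative of the sigmoid: g(x) * (1 - g(x))
    s = g(x)
    return s * (1.0 - s)

def forward(a, U, e, W, f):
    z = [e[j] + sum(a[i] * U[i][j] for i in range(len(a)))
         for j in range(len(e))]
    b = [g(zj) for zj in z]
    h = [f[k] + sum(b[j] * W[j][k] for j in range(len(b)))
         for k in range(len(f))]
    c = [g(hk) for hk in h]
    return z, b, h, c

def loss(c, t):
    # Assumed form: E = 1/2 * sum_k (c_k - t_k)^2
    return 0.5 * sum((ck - tk) ** 2 for ck, tk in zip(c, t))

def error_signals(a, U, e, W, f, t):
    z, b, h, c = forward(a, U, e, W, f)
    # delta_k = (c_k - t_k) * g'(h_k)
    delta = [(c[k] - t[k]) * g_prime(h[k]) for k in range(len(h))]
    # psi_j = g'(z_j) * sum_k delta_k * w_jk
    psi = [g_prime(z[j]) * sum(delta[k] * W[j][k] for k in range(len(delta)))
           for j in range(len(z))]
    return delta, psi

def numeric_psi(a, U, e, W, f, t, j, eps=1e-6):
    # z_j = e_j + ..., so dE/dz_j equals dE/de_j; perturb the bias e_j.
    e_plus = list(e)
    e_plus[j] += eps
    e_minus = list(e)
    e_minus[j] -= eps
    E_plus = loss(forward(a, U, e_plus, W, f)[3], t)
    E_minus = loss(forward(a, U, e_minus, W, f)[3], t)
    return (E_plus - E_minus) / (2.0 * eps)
```

Calling error_signals on a small network and comparing each psi[j] with numeric_psi(..., j) should show agreement up to finite-difference error, which is a cheap sanity check before using the error signals to form gradients for U, W, e and f.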



