CS 760 – Homework 8

>>NAME HERE<< >>ID HERE<<
Instructions:
• Please submit your answers in a single PDF file, preferably made using LaTeX. There is no need to submit the LaTeX source.
• See Piazza post @114 for recommendations on how to write up the answers.
• Submit code for the programming exercises. You can use any programming language you like, but we highly recommend Python and PyTorch.
• Submit all the material on time.
1 Principal Component Analysis [60 pts]
Download three.txt and eight.txt. Each file contains 200 handwritten digits; each line is one digit, vectorized from a 16×16 grayscale image. (A Python sketch covering all six steps below follows the list.)
1. (10 pts) Each line has 256 numbers: the pixel values (0 = black, 255 = white), vectorized from the image as the first column (top down), then the second column, and so on. Visualize the two grayscale images corresponding to the first line of three.txt and the first line of eight.txt.
2. (10 pts) Put the two data files together (threes first, eights next) to form an n × D matrix X, where n = 400 digits and D = 256 pixels. Note that we use the n × D size for X instead of D × n to be consistent with the convention in linear regression. The ith row of X is $x_i^\top$, where $x_i \in \mathbb{R}^D$ is the ith image in the combined data set. Compute the sample mean $y = \frac{1}{n} \sum_{i=1}^{n} x_i$. Visualize y as a 16×16 grayscale image.
3. (10 pts) Center X using y above. Then form the sample covariance matrix $S = \frac{1}{n-1} X^\top X$ (using the centered X). Show the 5×5 submatrix S(1…5, 1…5).
4. (10 pts) Use appropriate software to compute the two largest eigenvalues λ1 ≥ λ2 and the corresponding eigenvectors v1, v2 of S. For example, in MATLAB one can use eigs(S,2). Show the values of λ1, λ2. Visualize v1, v2 as two 16×16 grayscale images. Hint: their elements will not be in [0, 255], but you can shift and scale them appropriately. It is best if you can show an accompanying “colorbar” that maps gray levels to values.
5. (10 pts) Now we project the (centered) X down to the two PCA directions. Let V = [v1 v2] be the D × 2 matrix. The projection is simply XV. Show the resulting two coordinates for the first line of three.txt and the first line of eight.txt, respectively.
6. (10 pts) Now plot the 2D point cloud of the 400 digits after projection. For visual interest, color the points from three.txt red and the points from eight.txt blue. But keep in mind that PCA is an unsupervised learning method and does not know these class labels.
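For reference, the following is a minimal end-to-end sketch of the six steps above in Python with NumPy and Matplotlib. The file delimiter, the column-major reshape, and the 1/(n−1) normalization in the covariance are assumptions to check against the actual data; eigenvector signs are arbitrary, so the images of v1, v2 may appear inverted.

import numpy as np
import matplotlib.pyplot as plt

# Load the data; np.loadtxt assumes whitespace-separated values
# (pass delimiter="," if the files are comma-separated).
threes = np.loadtxt("three.txt")   # shape (200, 256)
eights = np.loadtxt("eight.txt")   # shape (200, 256)
X = np.vstack([threes, eights])    # n x D with n = 400, D = 256
n, D = X.shape

def to_image(v):
    # Each vector lists the image column by column (top down),
    # so reshape in column-major (Fortran) order.
    return v.reshape(16, 16, order="F")

# (1) First digit in each file.
for v, title in [(threes[0], "first three"), (eights[0], "first eight")]:
    plt.figure(); plt.imshow(to_image(v), cmap="gray"); plt.title(title)

# (2) Sample mean.
y = X.mean(axis=0)
plt.figure(); plt.imshow(to_image(y), cmap="gray"); plt.title("sample mean")

# (3) Center X and form the sample covariance matrix.
Xc = X - y
S = Xc.T @ Xc / (n - 1)
print(S[:5, :5])

# (4) Two largest eigenvalues/eigenvectors of the symmetric matrix S.
eigvals, eigvecs = np.linalg.eigh(S)   # returned in ascending order
lam1, lam2 = eigvals[-1], eigvals[-2]
V = eigvecs[:, [-1, -2]]               # D x 2 matrix [v1 v2]
print("lambda1 =", lam1, "lambda2 =", lam2)
for k in range(2):
    plt.figure(); plt.imshow(to_image(V[:, k]), cmap="gray")
    plt.colorbar(); plt.title(f"v{k+1}")

# (5) Project the centered data onto the two PCA directions.
P = Xc @ V                             # n x 2
print("first three:", P[0], " first eight:", P[200])

# (6) 2D point cloud: threes in red, eights in blue.
plt.figure()
plt.scatter(P[:200, 0], P[:200, 1], c="red", s=10, label="threes")
plt.scatter(P[200:, 0], P[200:, 1], c="blue", s=10, label="eights")
plt.legend(); plt.show()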
2 Q-learning [40 pts]
Consider the following Markov decision process. It has two states, A and B, and two actions: move and stay. The state transitions are deterministic: “move” moves to the other state, while “stay” stays at the current state. The reward r is 0 for move and 1 for stay. The discount factor is γ = 0.8.

[Figure: the two-state MDP. “move” transitions between A and B with reward 0; “stay” self-loops at each state with reward +1.]
The reinforcement learning agent performs Q-learning. Recall that the Q table has entries $Q(s, a)$. The Q table is initialized with all zeros. The agent starts in state $s_1 = A$. In any state $s_t$, the agent chooses an action $a_t$ according to a behavior policy, $a_t = \pi_B(s_t)$. Upon experiencing the next state $s_{t+1}$ and reward $r_t$, the update is
$Q(s_t, a_t) \leftarrow (1 - \alpha)\, Q(s_t, a_t) + \alpha \left( r_t + \gamma \max_{a'} Q(s_{t+1}, a') \right).$
Let the step size parameter α = 0.5.
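In code, this update is a single line. Here is a minimal Python helper, assuming states encoded as "A"/"B" and actions as "move"/"stay" (this encoding is illustrative, not prescribed by the assignment):

# One Q-learning update with step size alpha and discount gamma.
# Q is a dict keyed by (state, action) pairs, initialized to all zeros.
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.8):
    target = r + gamma * max(Q[(s_next, "move")], Q[(s_next, "stay")])
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * target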
1. (10 pts) Run Q-learning for 200 steps with a deterministic greedy behavior policy: at each state $s_t$, use the best action $a_t \in \arg\max_a Q(s_t, a)$ indicated by the current Q table. If there is a tie, prefer move. Show the Q table at the end. (A simulation sketch covering parts 1 and 2 follows this list.)
2. (10 pts) Reset and repeat the above, but with an ϵ-greedy behavior policy: at each state $s_t$, with probability 1 − ϵ choose what the current Q table says is the best action, $\arg\max_a Q(s_t, a)$, breaking ties arbitrarily; otherwise (with probability ϵ) choose uniformly between move and stay (each with probability 1/2). Use ϵ = 0.5.
3. (20 pts) Without doing any simulation, use the Bellman equation to derive the true Q table induced by the MDP.
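For parts 1 and 2, here is a minimal simulation sketch reusing the hypothetical q_update helper above. The state/action encoding and the tie-breaking via Python's max (which keeps the first maximal element, hence move) are illustrative assumptions:

import random

def run_q_learning(policy, steps=200, alpha=0.5, gamma=0.8, eps=0.5):
    states, actions = ["A", "B"], ["move", "stay"]
    Q = {(s, a): 0.0 for s in states for a in actions}
    s = "A"  # the agent starts in state A
    for _ in range(steps):
        greedy = max(actions, key=lambda a: Q[(s, a)])  # ties -> move
        if policy == "greedy" or random.random() >= eps:
            a = greedy
        else:  # exploration step of the epsilon-greedy policy
            a = random.choice(actions)
        # Deterministic dynamics: move flips the state, stay keeps it;
        # the reward is 0 for move and 1 for stay.
        s_next = {"A": "B", "B": "A"}[s] if a == "move" else s
        r = 0 if a == "move" else 1
        q_update(Q, s, a, r, s_next, alpha, gamma)
        s = s_next
    return Q

print(run_q_learning("greedy"))       # part 1
print(run_q_learning("eps-greedy"))   # part 2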
