RL – Reinforcement Learning Assignment 4 Solved


1 Introduction
The goal of this assignment is to experiment with the deep Q-network (DQN), which combines the advantages of Q-learning and neural networks. In classical Q-learning methods, the action-value function Q becomes intractable as the state space and action space grow. DQN brings the success of deep learning to this setting and has achieved a super-human level of play in Atari games. Your goal is to implement the DQN algorithm and an improved variant of it, and to try them out in some classical RL control scenarios.
2 Deep Q-learning
Algorithm 1: deep Q-learning with experience replay.
Initialize replay memory $D$ to capacity $N$
Initialize action-value function $Q$ with random weights $\theta$
Initialize target action-value function $\hat{Q}$ with weights $\theta^- = \theta$
For episode $= 1, M$ do
    Initialize sequence $s_1 = \{x_1\}$ and preprocessed sequence $\phi_1 = \phi(s_1)$
    For $t = 1, T$ do
        With probability $\varepsilon$ select a random action $a_t$
        otherwise select $a_t = \arg\max_a Q(\phi(s_t), a; \theta)$
        Execute action $a_t$ in emulator and observe reward $r_t$ and image $x_{t+1}$
        Set $s_{t+1} = s_t, a_t, x_{t+1}$ and preprocess $\phi_{t+1} = \phi(s_{t+1})$
        Store transition $(\phi_t, a_t, r_t, \phi_{t+1})$ in $D$
        Sample random minibatch of transitions $(\phi_j, a_j, r_j, \phi_{j+1})$ from $D$
        Set $y_j = r_j$ if episode terminates at step $j+1$,
            otherwise $y_j = r_j + \gamma \max_{a'} \hat{Q}(\phi_{j+1}, a'; \theta^-)$
        Perform a gradient descent step on $(y_j - Q(\phi_j, a_j; \theta))^2$ with respect to the network parameters $\theta$
        Every $C$ steps reset $\hat{Q} = Q$
    End For
End For
Figure 1: Deep Q-learning with experience replay
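The replay memory and the action-selection rule in Algorithm 1 can be sketched in plain Python as follows. This is only a minimal illustration, not a reference implementation: the names `ReplayMemory` and `epsilon_greedy` are hypothetical, Q-values are passed in as a plain list rather than produced by a network, and a `done` flag is stored with each transition (an assumption made here so that terminal steps can be recognised later).

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity buffer D; the oldest transitions are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        # One transition (phi_t, a_t, r_t, phi_{t+1}) plus a terminal flag.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement from the stored transitions.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, else argmax_a Q(s, a)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In a full agent, `q_values` would come from a forward pass of the Q-network on the current state, and `epsilon` is commonly decayed over training, although the schedule is left to you.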
You can refer to the original paper for the details of DQN: “Human-level control through deep reinforcement learning.” Nature 518.7540 (2015): 529.
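The target value y_j used in Algorithm 1 can be written as a small function, which makes the role of the target network explicit. The sketch below works on plain lists of Q-values for a single transition. The second function shows Double DQN, which is only one possible choice for the improved algorithm (the assignment does not prescribe which one to use). The function names and the default `gamma=0.99` are assumptions made for illustration.

```python
def dqn_target(reward, next_q_target, done, gamma=0.99):
    """y = r if terminal, else r + gamma * max_a' Q_hat(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN: the online network picks a', the target network scores it."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]


def squared_td_error(target, q_sa):
    """The per-sample loss (y - Q(s, a))^2 minimised by gradient descent."""
    return (target - q_sa) ** 2
```

The two targets differ only in how the next action is chosen. Standard DQN takes the maximum of the target network's own estimates, while Double DQN lets the online network choose the action and uses the target network only to evaluate it.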
3 Experiment Description
• Programming language: Python 3
• You should compare the performance of DQN and one kind of improved DQN, testing both in a classical RL control environment: MountainCar.
OpenAI Gym provides this environment, which is implemented in Python (https://gym.openai.com/envs/MountainCar-v0/). Gym also provides other, more complex environments, such as Atari games and MuJoCo.
Since the state is abstracted into the car's position and velocity, convolutional layers are not necessary in our experiment. To get started with OpenAI Gym, refer to this link (https://gym.openai.com/docs/). Note that it is suggested to implement your neural network in TensorFlow or PyTorch.
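To see why a small fully connected network is enough here, it helps to look at how low-dimensional the task is. The sketch below is a hypothetical stand-in for MountainCar-v0 written in plain Python, not the Gym implementation itself. The constants are assumed from the Gym environment description, and its `step` returns a 3-tuple, whereas Gym's own `step` also returns extra fields that differ between versions. For the actual experiments you should use the Gym environment.

```python
import math
import random


class MountainCarSketch:
    """Stand-in for MountainCar-v0; constants are assumptions from the Gym docs."""

    MIN_POS, MAX_POS, MAX_SPEED, GOAL_POS = -1.2, 0.6, 0.07, 0.5
    FORCE, GRAVITY, MAX_STEPS = 0.001, 0.0025, 200

    def reset(self, rng=random):
        self.position = rng.uniform(-0.6, -0.4)
        self.velocity = 0.0
        self.steps = 0
        return (self.position, self.velocity)

    def step(self, action):
        # action: 0 = push left, 1 = no push, 2 = push right
        self.velocity += (action - 1) * self.FORCE
        self.velocity -= math.cos(3 * self.position) * self.GRAVITY
        self.velocity = max(-self.MAX_SPEED, min(self.MAX_SPEED, self.velocity))
        self.position += self.velocity
        self.position = max(self.MIN_POS, min(self.MAX_POS, self.position))
        if self.position == self.MIN_POS and self.velocity < 0:
            self.velocity = 0.0  # the car stops at the left wall
        self.steps += 1
        done = self.position >= self.GOAL_POS or self.steps >= self.MAX_STEPS
        return (self.position, self.velocity), -1.0, done


def run_episode(env, policy):
    """Roll out one episode and return the undiscounted return."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total
```

For example, `run_episode(MountainCarSketch(), lambda s: random.randrange(3))` rolls out a uniformly random policy. Because every step costs -1, the return is simply the negated episode length, which is a convenient metric when comparing DQN against its improved variant.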
4 Report and Submission
• Your report and source code should be compressed and named after “studentID+name+assignment4”.
