1 Introduction
The goal of this assignment is to experiment with the deep Q-network (DQN), which combines the advantages of Q-learning and neural networks. In classical Q-learning, the action-value function Q becomes intractable as the state space and action space grow. DQN brings the success of deep learning to this setting and has achieved a super-human level of play in Atari games. Your task is to implement the DQN algorithm and one improved variant of it, and to try them out in some classical RL control scenarios.
2 Deep Q-learning
Algorithm 1: deep Q-learning with experience replay.
Initialize replay memory $D$ to capacity $N$
Initialize action-value function $Q$ with random weights $\theta$
Initialize target action-value function $\hat{Q}$ with weights $\theta^- = \theta$
For episode $= 1, M$ do
    Initialize sequence $s_1 = \{x_1\}$ and preprocessed sequence $\phi_1 = \phi(s_1)$
    For $t = 1, T$ do
        With probability $\varepsilon$ select a random action $a_t$
        otherwise select $a_t = \arg\max_a Q(\phi(s_t), a; \theta)$
        Execute action $a_t$ in emulator and observe reward $r_t$ and image $x_{t+1}$
        Set $s_{t+1} = s_t, a_t, x_{t+1}$ and preprocess $\phi_{t+1} = \phi(s_{t+1})$
        Store transition $(\phi_t, a_t, r_t, \phi_{t+1})$ in $D$
        Sample random minibatch of transitions $(\phi_j, a_j, r_j, \phi_{j+1})$ from $D$
        Set $y_j = r_j$ if episode terminates at step $j+1$, otherwise $y_j = r_j + \gamma \max_{a'} \hat{Q}(\phi_{j+1}, a'; \theta^-)$
        Perform a gradient descent step on $(y_j - Q(\phi_j, a_j; \theta))^2$ with respect to the network parameters $\theta$
        Every $C$ steps reset $\hat{Q} = Q$
    End For
End For
Figure 1: Deep Q-learning with experience replay
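As a rough illustration of three pieces of Algorithm 1 (the replay memory, the epsilon-greedy action choice, and the regression target for a sampled minibatch), a framework-free Python sketch could look like the following. The names `ReplayMemory`, `select_action` and `td_targets`, and the `done` flag stored with each transition, are illustrative assumptions rather than part of the algorithm listing; `target_q` stands for any callable that returns the target network's action values for a state.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity replay memory; the oldest transitions are dropped when full."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        # `done` marks whether the episode terminated at the next step (assumed field).
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random minibatch, drawn without replacement.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def select_action(q_values, epsilon):
    """Epsilon-greedy: a random action with probability epsilon, else the argmax."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_targets(batch, target_q, gamma):
    """Reward alone for terminal transitions, else reward + gamma * max target value."""
    targets = []
    for _state, _action, reward, next_state, done in batch:
        if done:
            targets.append(reward)
        else:
            targets.append(reward + gamma * max(target_q(next_state)))
    return targets
```

In an actual solution the action values would come from a TensorFlow or PyTorch network and the targets would feed a squared-error loss; plain Python lists are used here only to keep the sketch free of dependencies.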
For the details of DQN, refer to the original paper: Mnih et al., “Human-level control through deep reinforcement learning.” Nature 518.7540 (2015): 529.
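The assignment does not prescribe which improved DQN to use. Purely as an example, one widely used variant is Double DQN (van Hasselt et al.), which changes only how the target is computed: the online network picks the next action and the target network evaluates it. The sketch below contrasts the two targets for a single transition; the function names and the list-based Q-values are assumptions made for illustration.

```python
def dqn_target(reward, done, next_q_target, gamma):
    """Algorithm 1 target: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, done, next_q_online, next_q_target, gamma):
    """Double DQN target: online network selects the action, target network scores it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]
```

Here `next_q_online` and `next_q_target` are the action values of the next state under the online and target networks respectively. Other improvements (for example dueling architectures or prioritized replay) would be equally valid choices for the comparison.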
3 Experiment Description
• Programming language: Python 3
• You should compare the performance of DQN with one improved DQN variant, testing both in a classical RL control environment, MountainCar.
OpenAI Gym provides this environment, implemented in Python (https://gym.openai.com/envs/MountainCar-v0/). Gym also provides more complex environments, such as Atari games and MuJoCo.
Since the state is abstracted into the car’s position and velocity, convolutional layers are not necessary in this experiment. To get started with OpenAI Gym, refer to this link (https://gym.openai.com/docs/). It is suggested that you implement your neural network in TensorFlow or PyTorch.
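A first interaction with the environment might look like the sketch below, which plays one episode with random actions. Because the return values of `reset()` and `step()` differ between Gym releases, the helpers `reset_env` and `step_env` (hypothetical names, not part of Gym) accept both shapes; `run_random_episode` and its default arguments are likewise assumptions for illustration.

```python
import random


def reset_env(env):
    """Return the first observation for both old and new Gym reset() conventions."""
    out = env.reset()
    # Newer releases return (observation, info); older ones return the observation alone.
    if isinstance(out, tuple) and len(out) == 2 and isinstance(out[1], dict):
        return out[0]
    return out


def step_env(env, action):
    """Return (observation, reward, episode_over) for 4- or 5-tuple step() results."""
    out = env.step(action)
    if len(out) == 5:  # newer API: obs, reward, terminated, truncated, info
        obs, reward, terminated, truncated, _info = out
        return obs, reward, terminated or truncated
    obs, reward, done, _info = out  # older API: obs, reward, done, info
    return obs, reward, done


def run_random_episode(env, n_actions=3, max_steps=200):
    """Play one episode with uniformly random actions; return (total reward, steps)."""
    reset_env(env)
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = random.randrange(n_actions)  # MountainCar-v0 has 3 discrete actions
        _obs, reward, episode_over = step_env(env, action)
        total_reward += reward
        steps += 1
        if episode_over:
            break
    return total_reward, steps
```

With Gym installed, `env = gym.make("MountainCar-v0")` followed by `run_random_episode(env)` should run as is. A random policy rarely reaches the goal, so the episode typically ends at the step limit, which gives a useful baseline to compare the trained agents against.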
4 Report and Submission
• Your report and source code should be compressed into a single archive named “studentID+name+assignment4”.