Double DQN
20
Apr
DDPG and TD3
This post assumes that you have a strong understanding of the basics of Reinforcement Learning, MDP, DQN and Policy Gradient Algorithms. You can go through Policy Gradients to understand the derivation for Stochastic Policies In the previous post on Actor Critic, we saw the advantage of merging Value based and Policy based methods together. The […]