Double DQN

Apr

This post assumes that you have a strong understanding of the basics of Reinforcement Learning, MDPs, DQN, and Policy Gradient algorithms. You can go through Policy Gradients to understand the derivation for Stochastic Policies. In the previous post on Actor Critic, we saw the advantage of merging value-based and policy-based methods. The […]

Posted in: Reinforcement Learning, Robotics

Tags: Actor Critic, Actor Critic method, Bellman equation, DDPG, Deep Deterministic Policy Gradients, deterministic policy, Double DQN, DQN, Experience Replay, exploration vs exploitation, Fixed D Targets, policy gradients, Q learning, TD3 RL, Twin Delayed Double Deterministic Policy Gradients