Actor Critic method
20 Apr
DDPG and TD3
This post assumes that you have a strong understanding of the basics of Reinforcement Learning, MDPs, DQN, and Policy Gradient algorithms. You can go through Policy Gradients to understand the derivation for stochastic policies. In the previous post on Actor Critic, we saw the advantage of merging value-based and policy-based methods. The […]
13 Apr
A Summary of Model-Free RL Algorithms
Reinforcement Learning (RL) refers to training agents with the help of incentive-driven environments. RL typically involves a <state, action, reward> tuple: the agent chooses among actions in each state, and each action carries a potential reward. This also means that each state has a “value” […]
Tags:
Actor Critic,
Actor Critic method,
AI lab,
cartpole,
DDPG,
deep Q learning,
deep Q network,
Deep Reinforcement Learning,
deterministic policy,
DQN,
DRL course,
gaming,
markov decision process,
markov process,
Markov Reward Process,
mdp,
model-based RL,
model-free RL,
monte carlo,
off-policy,
on-policy,
policy based methods,
policy gradients,
PPO,
Q learning,
Q table,
Rainbow method,
Reinforcement learning course,
reward function,
RL course,
RL for gaming,
RL training,
SAC,
stochastic policy,
TD Learning,
TD3,
training in reinforcement learning,
TRPO,
Value-based methods