Q learning

Apr

This post discusses temporal difference (TD) methods used in Reinforcement Learning. It contrasts TD methods with Monte Carlo (MC) methods and dynamic programming. You need a thorough understanding of Markov Decision Processes (MDPs) to follow this post. Prediction and Control: In general, RL methods have two components: 1) Prediction / Evaluation: where […]

Posted in: Reinforcement Learning, Robotics

Tags: epsilon greedy technique, exploration vs exploitation, On-policy vs off-policy, policy improvement, policy iteration, Q learning, reinforcement learning, SARSA, SARSAMax, TD control, TD Learning, Temporal Difference, value iteration

Apr

DDPG and TD3

This post assumes that you have a strong understanding of the basics of Reinforcement Learning, MDP, DQN and Policy Gradient algorithms. You can go through Policy Gradients to understand the derivation for stochastic policies. In the previous post on Actor Critic, we saw the advantage of merging value-based and policy-based methods together. The […]

Posted in: Reinforcement Learning, Robotics

Tags: Actor Critic, Actor Critic method, Bellman equation, DDPG, Deep Deterministic Policy Gradients, deterministic policy, Double DQN, DQN, Experience Replay, exploration vs exploitation, Fixed D Targets, policy gradients, Q learning, TD3 RL, Twin Delayed Double Deterministic Policy Gradients

Apr

A Summary of Model-Free RL Algorithms

Reinforcement Learning (RL) refers to training agents with the help of incentive-driven environments. RL typically follows a <state, action, reward> paradigm, which means that the agent has action choices to make in various states, and each action entails a potential reward. This also means that each state has a “value” […]

Posted in: Reinforcement Learning, Robotics

Tags: Actor Critic, Actor Critic method, AI lab, cartpole, DDPG, deep Q learning, deep Q network, Deep Reinforcement Learning, deterministic policy, DQN, DRL course, gaming, markov decision process, markov process, Markov Reward Process, mdp, model-based RL, model-free RL, monte carlo, off-policy, on-policy, policy based methods, policy gradients, PPO, Q learning, Q table, Rainbow method, Reinforcement learning course, reward function, RL course, RL for gaming, RL training, SAC, stochastic policy, TD Learning, TD3, training in reinforcement learning, TRPO, Value-based methods

Mar

RL with Actor-Critic Methods

Minutes from the AI Lab Workshop at BLR on Saturday 14th March 2020. Session Presenter: SHUBHA M., Deep Reinforcement Learning Researcher, CellStrat AI Lab. Last Saturday, our Reinforcement Learning Team Lead Shubha M. gave a fantastic presentation and workshop on the Actor-Critic method used in RL. She also gave a demo of this technique for stock market predictions. […]

Posted in: Reinforcement Learning, Robotics

Tags: Actor Critic, deep Q learning, Deep Reinforcement Learning, DRL, markov decision process, Markov Reward Process, mdp, MRP, policy based methods, policy gradients, Q Actor Critic, Q learning, REINFORCE algorithm, reinforcement learning, Value-based methods

Feb

Policy Gradients – An Introduction

I conducted an introductory session on Reinforcement Learning Policy Gradients (PG) at CellStrat AI Lab on 1st Feb 2020. The goal of this session was to explain the basic underlying principle of Policy Gradients. The session started off with a quick recap of Reinforcement Learning, so that the audience was well aware of the definitions […]

Posted in: Reinforcement Learning

Tags: Acrobot, deep Q network, deepmind, DQN, experience replay buffer, fixed Q targets, gradient ascent, openAI Gym, policy based methods, policy gradients, Q learning, reinforcement learning, reward function, rewards based learning, value iteration algorithm

Oct

Meeting Minutes from AI Lab session on Saturday 12th Oct in Bengaluru

Last Saturday saw some amazing sessions on advanced AI at the CellStrat AI Lab meetup. Diabetes prediction with Machine Learning: First came a superb presentation by Dr Purnendu Das on diabetes prediction using ML. Dr Das started by discussing the data sources for healthcare and clinical data analysis. He covered the […]

Posted in: Computer Vision, Deep Learning, Healthcare, Machine Learning, Reinforcement Learning

Tags: deep Q learning, deep Q network, diabetes mellitus, diabetes prediction, DQN, Healthcare, healthcare predictions, Q learning, reinforcement learning, style art, style transfer

Aug

Meeting Minutes from AI Lab Hands-On Workshop on Saturday 24th Aug in Bengaluru

Last Saturday, our team lead for Reinforcement Learning (RL), Shubha Manikarnike, presented a fabulous hands-on workshop on RL and its various concepts and algorithms, such as the Markov Decision Process (MDP), Policy Gradients, the Bellman equation, Q-learning etc. The session started with an introduction to RL. There was a comparison of how this is different from […]

Posted in: Deep Learning, Reinforcement Learning

Tags: Bellman equation, epsilon greedy technique, exploration vs exploitation, Frozen Lake, markov decision process, openAI Gym, Q learning, reinforcement learning, rewards based learning