Reinforcement Learning

Jul

Deep Deterministic Policy Gradients for Supply Chain Optimization

Author: Shubha Manikarnike. Submitted on: 5 Dec 2020. Abstract: This paper is a review of the paper “Reinforcement Learning for Supply Chain Optimization” (https://www.researchgate.net/publication/328676423_Reinforcement_learning_for_supply_chain_optimization). It helps us understand how to model the supply chain environment as a Reinforcement Learning problem and how the DDPG algorithm can be used to successfully […]

Posted in: papers, Reinforcement Learning, Retail,

Aug

AlphaZero with Monte Carlo Tree Search

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #WhereLearningNeverStops I recently presented a session on the “AlphaZero with Monte Carlo Tree Search” algorithm at the CellStrat AI Lab. AlphaZero was introduced by Google DeepMind in 2017 as a successor to AlphaGo, the program that mastered the game of Go and beat 18-time world champion Lee Sedol in 2016. Go is an ancient Chinese abstract strategy […]

Posted in: Gaming, Reinforcement Learning, Robotics,

Tags: Actor Critic, AlphaGO, AlphaZero, deepmind, Game Tree, MCTS, Monte Carlo Tree Search, policy evaluation, policy improvement, policy iteration, reinforcement learning, UCT, Upper Confidence Bound,

Aug

Hierarchical Text Generation and Planning for Strategic Dialogue using RL

Introduction: Moving up the value chain, CellStrat would like to encourage discussions and webinars focusing on the application of AI to real-life problem-solving. A beginning has already been made, and this is another step in that direction. The use of Deep Learning and Reinforcement Learning to solve a complex Strategic Negotiation is a very […]

Posted in: Artificial General Intelligence (AGI), Artificial Intelligence, Natural Language Processing, Reinforcement Learning, Retail,

Jul

Metric-based Meta Learning

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #WhereLearningNeverStops Recently, I presented a session on Metric-based Meta Learning at the CellStrat AI Lab (where I am an AI Researcher). Metric-based Meta Learning might be considered a domain of Artificial General Intelligence (AGI), because Meta Learning helps us create generalized systems with relatively little data. […]

Posted in: Artificial General Intelligence (AGI), Reinforcement Learning,

Tags: AGI, Artificial General Intelligence, DRL, Few shot classification, Full Context Embedding, Matching Network, Meta Learning, Meta Training, Metric-based Meta Learning, Model-based Meta Learning, Omniglot, Optimization-based Meta Learning, Prototypical Network, reinforcement learning, Relation Network, siamese network, Zero Shot Learning,

Jun

Multi Agent Reinforcement Learning

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars I recently presented a session on Multi-Agent RL at the CellStrat AI Lab. Introduction: In the standard Reinforcement Learning setup, a single agent interacts with the environment. It receives observations from the environment, performs actions and observes the resulting rewards. In real life, many applications will involve several agents […]
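The single-agent loop described in this excerpt can be sketched in a few lines of Python. This is only an illustration: the `CoinFlipEnv` environment, its `reset`/`step` methods and the `run_episode` helper are hypothetical names made up for this sketch and do not come from the post.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the agent guesses the next coin flip."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.t = 0
        return self.rng.randint(0, 1)

    def step(self, action):
        """Apply the action; return (observation, reward, done)."""
        flip = self.rng.randint(0, 1)
        reward = 1.0 if action == flip else 0.0
        self.t += 1
        done = self.t >= self.horizon
        return flip, reward, done


def run_episode(env, policy):
    """One agent interacting with one environment until the episode ends."""
    observation = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = policy(observation)                   # agent acts on the observation
        observation, reward, done = env.step(action)   # environment responds
        total_reward += reward                         # agent observes the reward
        steps += 1
    return total_reward, steps


if __name__ == "__main__":
    # A trivial policy that repeats the last observed flip.
    print(run_episode(CoinFlipEnv(horizon=10, seed=0), lambda obs: obs))
```

In a multi-agent setting, `step` would take one action per agent and return one observation and reward per agent, which is where the single-agent assumptions start to break down.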

Posted in: Reinforcement Learning,

Tags: Actor Critic, DDPG, Deep Deterministic Policy Gradients, MADDPG, Markov Games, Multi Agent Actor Critic, Multi Agent DDPG, Multi Agent RL, Policy Ensembles, policy gradients, Policy Inference, reinforcement learning, Unity ML,

Apr

Temporal Difference methods in RL

This post discusses temporal difference (TD) methods used in Reinforcement Learning. It contrasts TD methods with Monte Carlo (MC) methods and dynamic programming. You need a thorough understanding of Markov Decision Processes (MDPs) to follow this post. Prediction and Control: In general, RL methods have two components: 1) Prediction / Evaluation: where […]
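As a rough illustration of the prediction side of TD learning, here is a minimal tabular TD(0) sketch in Python. The random-walk task, the function name `td0_prediction` and all parameter values are assumptions chosen for this example, not taken from the post.

```python
import random


def td0_prediction(num_states=5, episodes=5000, alpha=0.02, gamma=1.0, seed=0):
    """Estimate state values of a random walk with tabular TD(0).

    Assumed toy task: states 0..num_states-1 are non-terminal. Stepping left
    of state 0 ends the episode with reward 0; stepping right of the last
    state ends it with reward 1. The evaluated policy moves left or right
    with equal probability.
    """
    rng = random.Random(seed)
    values = [0.0] * num_states
    for _ in range(episodes):
        state = num_states // 2  # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= num_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # TD(0): move V(s) toward the bootstrapped target r + gamma * V(s')
            values[state] += alpha * (reward + gamma * next_value - values[state])
            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    print([round(v, 2) for v in td0_prediction()])
```

Unlike a Monte Carlo estimate, the update happens after every step and uses the current estimate of the next state's value instead of waiting for the full episode return. For this particular walk the true values are (i + 1) / 6 for state i, so the estimates should settle near those numbers.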

Posted in: Reinforcement Learning, Robotics,

Tags: epsilon greedy technique, exploration vs exploitation, On-policy vs off-policy, policy improvement, policy iteration, Q learning, reinforcement learning, SARSA, SARSAMax, TD control, TD Learning, Temporal Difference, value iteration,

Apr

DDPG and TD3

This post assumes that you have a strong understanding of the basics of Reinforcement Learning, MDP, DQN and Policy Gradient algorithms. You can go through Policy Gradients to understand the derivation for stochastic policies. In the previous post on Actor Critic, we saw the advantage of merging Value-based and Policy-based methods. The […]

Posted in: Reinforcement Learning, Robotics,

Tags: Actor Critic, Actor Critic method, Bellman equation, DDPG, Deep Deterministic Policy Gradients, deterministic policy, Double DQN, DQN, Experience Replay, exploration vs exploitation, Fixed Q Targets, policy gradients, Q learning, TD3 RL, Twin Delayed Deep Deterministic Policy Gradients,

Apr

A Summary of Model-Free RL Algorithms

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #AlwaysUpskilling Reinforcement Learning (RL) refers to training agents with the help of incentive-driven environments. RL is typically framed around a <state, action, reward> tuple: the agent has action choices to make in various states, and each action entails a potential reward. This also means that each state has a “value” […]
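One common way to make the “value” of a state concrete is the discounted sum of the rewards that follow it. The sketch below assumes that standard definition; the function name `discounted_return` and the discount factor `gamma` are illustrative choices, not taken from the post.

```python
def discounted_return(rewards, gamma=0.9):
    """Return r_0 + gamma * r_1 + gamma**2 * r_2 + ... for one reward sequence."""
    total = 0.0
    # Work backwards so each step is a single multiply-and-add.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # Three rewards of 1 with gamma = 0.5: 1 + 0.5 + 0.25
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```

Averaging this quantity over many episodes that start in the same state gives a Monte Carlo estimate of that state's value under the policy that generated the episodes.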

Posted in: Reinforcement Learning, Robotics,

Tags: Actor Critic, Actor Critic method, AI lab, cartpole, DDPG, deep Q learning, deep Q network, Deep Reinforcement Learning, deterministic policy, DQN, DRL course, gaming, markov decision process, markov process, Markov Reward Process, mdp, model-based RL, model-free RL, monte carlo, off-policy, on-policy, policy based methods, policy gradients, PPO, Q learning, Q table, Rainbow method, Reinforcement learning course, reward function, RL course, RL for gaming, RL training, SAC, stochastic policy, TD Learning, TD3, training in reinforcement learning, TRPO, Value-based methods,

Mar

RL with Actor-Critic Methods

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #AlwaysUpskilling Minutes from the Saturday 14th March 2020 AI Lab Workshop at BLR :- Session Presenter : SHUBHA M., Deep Reinforcement Learning Researcher, CellStrat AI Lab. Last Saturday, our Reinforcement Learning Team Lead Shubha M. gave a fantastic presentation and workshop on the Actor-Critic method used in RL. She also demonstrated this technique applied to Stock Market predictions. […]

Posted in: Reinforcement Learning, Robotics,

Tags: Actor Critic, deep Q learning, Deep Reinforcement Learning, DRL, markov decision process, Markov Reward Process, mdp, MRP, policy based methods, policy gradients, Q Actor Critic, Q learning, REINFORCE algorithm, reinforcement learning, Value-based methods,