reinforcement learning


Oct

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #WhereLearningNeverStops Recently, I presented an extensive session on Logic and Reasoning at the CellStrat AI Lab. This topic falls under the broader area of Artificial General Intelligence (AGI). The workshop covered the following topics: Symbolic AI, Propositional Logic, First Order Logic, Program Synthesis, and Relation Network. Symbolic AI: Although Deep Learning […]

Posted in: Artificial General Intelligence (AGI),

Tags: bAbI, CLEVR, domain specific language, Dynamic Physical Systems, First Order Logic, Flash Fill, FOL, genetic algorithms, Logic, Probabilistic logic, program synthesis, Propositional Logic, Prose, Reasoning, reinforcement learning, Relation Network, Scene Understanding, search heuristics, Social Network Modelling, Sort of CLEVR, Symbolic AI, Visual Question Answering,

Aug

AlphaZero with Monte Carlo Tree Search

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #WhereLearningNeverStops Recently, I presented a session on the “AlphaZero with Monte Carlo Tree Search” algorithm at the CellStrat AI Lab. AlphaZero builds on AlphaGo, the program developed by Google DeepMind that mastered the game of Go and, in 2016, beat Lee Sedol, an 18-time world champion. Go is an ancient Chinese abstract strategy […]

Posted in: Gaming, Reinforcement Learning, Robotics,

Tags: Actor Critic, AlphaGO, AlphaZero, deepmind, Game Tree, MCTS, Monte Carlo Tree Search, policy evaluation, policy improvement, policy iteration, reinforcement learning, UCT, Upper Confidence Bound,

Aug

Optimization-Based Meta Learning

Recently, I presented a session on Optimization-based Meta Learning, Part 3 of the Meta Learning Series, at the CellStrat AI Lab. The previous parts are Part 1 (Metric-based Meta Learning) and Part 2 (Model-based Meta Learning). Meta Learning, of course, refers to “Learning to Learn”. Quick Recap (Metric-based Meta Learning): Meta-Learning deals with […]

Posted in: Artificial General Intelligence (AGI),

Tags: fast weights, first order MAML, First-order Meta Learning, FOMAML, MAML, Meta Learning, Metric-based Meta Learning, Model-agnostic meta learner, Model-based Meta Learning, Optimization-based Meta Learning, reinforcement learning, Reptile, transfer learning,

Jul

Metric-based Meta Learning

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #WhereLearningNeverStops Recently, I presented a session on Metric-based Meta Learning at the CellStrat AI Lab (where I am an AI Researcher). Metric-based Meta Learning might be considered a domain of Artificial General Intelligence (AGI). This is because Meta Learning helps us create generalized systems with relatively little data. […]

Posted in: Artificial General Intelligence (AGI), Reinforcement Learning,

Tags: AGI, Artificial General Intelligence, DRL, Few shot classification, Full Context Embedding, Matching Network, Meta Learning, Meta Training, Metric-based Meta Learning, Model-based Meta Learning, Omniglot, Optimization-based Meta Learning, Prototypical Network, reinforcement learning, Relation Network, siamese network, Zero Shot Learning,

Jun

Multi Agent Reinforcement Learning

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars I recently presented a session on Multi-Agent RL at the CellStrat AI Lab. Introduction: In the standard Reinforcement Learning setup, a single agent interacts with the environment. It receives an observation from the environment, performs an action, and observes the resulting reward. In real life, many applications will involve several agents […]

Posted in: Reinforcement Learning,

Tags: Actor Critic, DDPG, Deep Deterministic Policy Gradients, MADDPG, Markov Games, Multi Agent Actor Critic, Multi Agent DDPG, Multi Agent RL, Policy Ensembles, policy gradients, Policy Inference, reinforcement learning, Unity ML,
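The single-agent loop described in the Multi Agent Reinforcement Learning excerpt above (observe, act, receive a reward) can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the post: `CoinFlipEnv`, `copy_policy`, and `run_episode` are hypothetical names invented here, and the toy environment simply rewards the agent for echoing the bit it observes.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the agent must guess a visible bit."""

    def __init__(self, horizon=5, seed=0):
        self.horizon = horizon          # number of steps per episode
        self.rng = random.Random(seed)  # private RNG for reproducibility
        self.t = 0
        self.hidden = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.t = 0
        self.hidden = self.rng.randint(0, 1)
        return self.hidden

    def step(self, action):
        """Apply an action; return (next_observation, reward, done)."""
        reward = 1.0 if action == self.hidden else 0.0
        self.t += 1
        self.hidden = self.rng.randint(0, 1)
        done = self.t >= self.horizon
        return self.hidden, reward, done


def copy_policy(obs):
    """Hypothetical policy: act by repeating the observation."""
    return obs


def run_episode(env, policy):
    """One agent interacting with one environment until the episode ends."""
    obs = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(obs)                  # agent chooses an action
        obs, reward, done = env.step(action)  # environment responds
        total_reward += reward
    return total_reward


episode_return = run_episode(CoinFlipEnv(horizon=5), copy_policy)
print(episode_return)
```

In a multi-agent setting, the same loop would collect one action per agent before each environment step, which is where the single-agent assumptions start to break down.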

Apr

Temporal Difference methods in RL

This post discusses temporal difference (TD) methods used in Reinforcement Learning. It contrasts TD methods with Monte Carlo (MC) methods and dynamic programming. You need a thorough understanding of Markov Decision Processes (MDPs) to follow this post. Prediction and Control: In general, RL methods have two components: 1) Prediction / Evaluation: where […]

Posted in: Reinforcement Learning, Robotics,

Tags: epsilon greedy technique, exploration vs exploitation, On-policy vs off-policy, policy improvement, policy iteration, Q learning, reinforcement learning, SARSA, SARSAMax, TD control, TD Learning, Temporal Difference, value iteration,
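To make the TD versus MC contrast in the Temporal Difference excerpt above concrete, here is a minimal sketch of the two value-prediction updates. It is not taken from the post: the function names, the two-state episode, and the step size `alpha=0.5` are assumptions chosen for illustration. MC waits for the episode to finish and moves each state value toward the observed return, whereas TD(0) updates after every step by bootstrapping from the current estimate of the next state.

```python
def mc_update(values, episode, alpha=0.5, gamma=1.0):
    """Every-visit Monte Carlo prediction.

    `episode` is a list of (state, reward) pairs, where `reward` is the
    reward received after leaving `state`. Updates run only once the
    whole episode is known.
    """
    g = 0.0
    for state, reward in reversed(episode):
        g = reward + gamma * g  # return from this state onward
        values[state] += alpha * (g - values[state])
    return values


def td0_update(values, state, reward, next_state, done, alpha=0.5, gamma=1.0):
    """One TD(0) prediction update, applied after a single transition."""
    bootstrap = 0.0 if done else gamma * values[next_state]
    target = reward + bootstrap  # one-step bootstrapped target
    values[state] += alpha * (target - values[state])
    return values


# Hypothetical episode: A -> B -> terminal, with reward 1 on the last step.
episode = [("A", 0.0), ("B", 1.0)]

v_mc = mc_update({"A": 0.0, "B": 0.0}, episode)

v_td = {"A": 0.0, "B": 0.0}
td0_update(v_td, "A", 0.0, "B", done=False)
td0_update(v_td, "B", 1.0, None, done=True)

print(v_mc)  # {'A': 0.5, 'B': 0.5}
print(v_td)  # {'A': 0.0, 'B': 0.5}
```

After this single episode, MC has already credited state A with the final reward, while TD(0) has not, because V(B) was still zero when A was updated; the reward only propagates back to A on later episodes.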

Apr

AI/ML solutions for COVID-19 Pandemic

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #AlwaysUpskilling Last Saturday (4th Apr ’20), CellStrat AI Lab conducted a Global Code Jam on AI/ML solutions for the current COVID-19 pandemic, which has caused a great many problems around the world. With the Code Jam, we tried to solve some of the COVID-19 problems with the help of AI/ML […]

Posted in: COVID-19, Infectious Diseases,

Tags: Chest X-Ray, community surveillance, coronavirus, COVID-19, epidemiological, image segmentation, infectious diseases, molecule synthesis, object detection, pandemic, reinforcement learning, social distancing app, SQuaD, Time-series, VAE,

Mar

RL with Actor-Critic Methods

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #AlwaysUpskilling Minutes from the Saturday 14th March 2020 AI Lab Workshop at BLR: Session Presenter: SHUBHA M., Deep Reinforcement Learning Researcher, CellStrat AI Lab. Last Saturday, our Reinforcement Learning Team Lead Shubha M. gave a fantastic presentation and workshop on the Actor-Critic method used in RL. She also gave a demo of this technique for Stock Market predictions. […]

Posted in: Reinforcement Learning, Robotics,

Tags: Actor Critic, deep Q learning, Deep Reinforcement Learning, DRL, markov decision process, Markov Reward Process, mdp, MRP, policy based methods, policy gradients, Q Actor Critic, Q learning, REINFORCE algorithm, reinforcement learning, Value-based methods,

Mar

Face Recognition with MTCNN and FaceNet; RL with Proximal Policy Optimization

#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #AlwaysUpskilling Minutes from the Saturday 7th March 2020 AI Lab meetup at BLR: Last Saturday, we had excellent sessions in the AI Lab meetup. Face Recognition with MTCNN and FaceNet: First, Amit Kumar presented a detailed overview of Face Recognition with MTCNN and FaceNet. Face Recognition involves a pipeline of Face […]

Posted in: Computer Vision, Deep Learning, Reinforcement Learning, Security,

Tags: Advantage Function, Face Recognition, facenet, Importance Sampling, MTCNN, MTCNN face detector, O-Net, PG algo, policy based methods, policy gradients, PPO, proximal policy optimization, R-Net, reinforcement learning, Triplet Loss, Value-based methods, Vanilla Policy Gradient, VPG,

Mar

Proximal Policy Optimization

In my previous post, we discussed the simplest Policy Gradient method, REINFORCE. We saw how policy-based methods are better than value-based methods, a derivation of the gradient of the score (cost) function, and an implementation of a simple Policy Gradient to train Gym’s Acrobot-v0. We then saw how introducing a baseline reduces variance, which leads to the […]

Posted in: Reinforcement Learning,

Tags: clipped PPO, Importance Sampling, PG, policy gradients, PPO, proximal policy optimization, reinforcement learning, RL, TRPO, trust region policy optimization,