RL
12 Mar
Proximal Policy Optimization
In my previous post, we discussed REINFORCE, the simplest policy gradient method. We saw why policy-based methods can be preferable to value-based methods, derived the gradient of the score (cost) function, and implemented a simple policy gradient to train Gym’s Acrobot-v0. We then saw how introducing a baseline reduces variance, which leads to the […]
28 May
MEETING MINUTES FROM SATURDAY 25TH MAY – AI LAB SESSION IN BENGALURU
#CellStratAILab #disrupt4.0 #WeCreateAISuperstars We had another round of deep AI sessions last Saturday at the BLR AI Lab meetup. I started with a detailed deep-dive on Maximum Likelihood Estimation (MLE) for Logistic Regression, which is a classification technique. Logistic Regression has a concave optimization curve wherein we try to maximize the Likelihood […]