Deep Deterministic Policy Gradients for Supply Chain Optimization

July 21, 2021
Posted by: vsinghal
Category: papers, Reinforcement Learning, Retail

Author : Shubha Manikarnike

Submitted on : 5 Dec 2020

Abstract – This article reviews the paper “Reinforcement Learning for Supply Chain Optimization” (https://www.researchgate.net/publication/328676423_Reinforcement_learning_for_supply_chain_optimization). It explains how to model the supply chain environment as a Reinforcement Learning problem and how the Deep Deterministic Policy Gradient (DDPG) algorithm can be used to solve that environment. The paper models the supply chain environment as a Markov Decision Process by defining its states, actions and rewards. We then look at how DDPG can be applied in a supply chain environment. Although a wide range of traditional optimization methods is available for inventory and price management, deep reinforcement learning has the potential to substantially improve optimization for these and other enterprise operations, owing to recent advances in generic self-learning algorithms for optimal control.

Keywords – Reinforcement Learning, Supply Chain Optimization, DDPG, Deep Deterministic Policy Gradient, Markov Decision Process