Distributional reinforcement learning for inventory management in multi-echelon supply chains

Received: 10 Jan 2023, Revised: 20 Jan 2023, Accepted: 10 Feb 2023, Available online: 15 Apr 2023, Version of Record: 30 Apr 2023

Guoquan Wu a, Miguel Ángel de Carvalho Servia b, Max Mowbray c

a Department of Chemical and Biomolecular Engineering, National University of Singapore, 117585, Singapore
b Department of Chemical Engineering, Imperial College London, South Kensington, London, SW7 2AZ, United Kingdom
c Centre for Process Integration, Department of Chemical Engineering, The University of Manchester, Manchester, M13 9PL, United Kingdom

Abstract


Reinforcement learning (RL) is an effective method for solving stochastic sequential decision-making problems. Supply chain operations commonly fit this problem description; however, most RL algorithms are tailored to game-based benchmarks. Here, we propose a deep RL method tailored to supply chain problems. The proposed algorithm uses a derivative-free approach to balance exploration and exploitation of the neural policy’s parameter space, providing a means to avoid low-quality local optima. Furthermore, the method accommodates risk-sensitive formulations, so that a policy can be learned to optimize, for example, the conditional value-at-risk. The capabilities of our algorithm are tested on a multi-echelon supply chain problem and on several combinatorial optimization problems. The results empirically demonstrate the method’s improved sample efficiency relative to the benchmark algorithm, proximal policy optimization, and its superior performance relative to shrinking-horizon mixed-integer formulations. Additionally, its risk-sensitive policy can offer protection from low-probability, high-severity scenarios. Finally, we provide a sensitivity analysis for technical intuition.
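To make the risk-sensitive objective mentioned above concrete, the following is a minimal sketch of an empirical conditional value-at-risk (CVaR) estimate computed from sampled episode returns. It is an illustration under stated assumptions, not the authors' implementation: the function name `empirical_cvar`, the tail level `alpha`, the convention that higher returns are better, and the sample values are all hypothetical.

```python
import math


def empirical_cvar(returns, alpha=0.1):
    """Estimate CVaR as the mean of the worst ceil(alpha * n) sampled returns.

    Assumption (not from the paper): returns are rewards, so "worst" means
    lowest. alpha is the tail fraction, in (0, 1].
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(returns)                    # ascending: worst outcomes first
    k = max(1, math.ceil(alpha * len(ordered)))  # number of samples in the tail
    tail = ordered[:k]
    return sum(tail) / k


# Hypothetical episode returns for one policy, including one rare, severe loss.
sampled_returns = [10.0, 12.0, 11.0, -50.0, 9.0, 13.0, 10.0, 11.0]

mean_return = sum(sampled_returns) / len(sampled_returns)
tail_return = empirical_cvar(sampled_returns, alpha=0.25)
print(mean_return, tail_return)
```

With `alpha=1.0` the estimate reduces to the ordinary mean, so the tail level controls how strongly the objective weights low-probability, high-severity outcomes compared with average performance.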
Keywords
Distributional reinforcement learning
Optimal control
Inventory management
Multi-echelon supply chains
Machine learning






   

Indexed in Scopus

https://www.scopus.com/authid/detail.uri?authorId=55483767200
      

DOI: 10.31763/DSJ.v5i1.1674

   


Conflict of interest


The authors state no conflict of interest.


Funding Information


This research received no external funding or grants.


Peer review:


Peer review under the responsibility of Defence Science Journal.


Ethics approval:


Not applicable.


Consent for publication:


Not applicable.


Acknowledgements:


None.