Deep policy iteration with integer programming for inventory management

Pavithra Harsha; Ashish Jagmohan; Jayant Kalagnanam; Brian Quanz; Divya Singhvi

Publication

MSOM 2022

Talk

Deep policy iteration with integer programming for inventory management

MSOM 2022

Abstract

In this work, we discuss Programmable Actor Reinforcement Learning (PARL), a policy iteration method that uses techniques from integer programming and sample average approximation. We numerically benchmark the algorithm in complex supply chain settings where the optimal solution is intractable and show that it performs comparably to, and sometimes better than, state-of-the-art RL methods and commonly used inventory management benchmarks.

Date

26 Jun 2022

Authors

IBM-affiliated at time of publication
