Military Decision Support with Actor and Critic Reinforcement Learning Agents

Received: 16 May 2024, Revised: 25 May 2024, Accepted: 07 Aug 2024, Available online: 18 Aug 2024, Version of Record: 18 Aug 2024

Jungmok Ma

Abstract


While recent advanced military operational concepts require intelligent support for command and control, Reinforcement Learning (RL) has not been actively studied in the military domain. This study identifies the limitations of RL for military applications through a literature review and aims to improve the understanding of RL for military decision support under these limitations. Above all, the black-box nature of Deep RL, compounded by complex simulation tools, makes the internal decision process difficult to understand. A scalable weapon selection RL framework is built that can be solved in either a tabular form or a neural network form. Transferring the Deep Q-Network (DQN) solution into the tabular form allows an effective comparison of its results with the Q-learning solution. Furthermore, rather than selectively using one or two RL models as in previous studies, RL models are divided into actors and critics and systematically compared. A random agent, Q-learning and DQN agents as critics, a Policy Gradient (PG) agent as an actor, and Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) agents as actor-critic approaches are designed, trained, and tested. The performance results show that the trained DQN and PPO agents are the best decision support candidates for the weapon selection RL framework.
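As a concrete illustration of the tabular critic discussed above, the sketch below implements plain Q-learning for a toy weapon selection task. The environment, the numbers of target types and weapons, the kill-probability table, and the reward scheme are hypothetical placeholders introduced for this example only; they are not the paper's simulation or its actual framework.

```python
# Minimal sketch (hypothetical example, not the paper's simulation):
# tabular Q-learning for a toy weapon-selection task in which the state is
# a target type and the action is the weapon assigned to it.
import random

N_TARGET_TYPES = 4          # hypothetical number of target types (states)
N_WEAPONS = 3               # hypothetical number of weapon types (actions)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Hypothetical effectiveness table: probability that weapon a destroys target type s.
P_KILL = [[0.9, 0.2, 0.4],
          [0.3, 0.8, 0.5],
          [0.1, 0.4, 0.9],
          [0.6, 0.6, 0.6]]

# Q-table: one row per state (target type), one column per action (weapon).
Q = [[0.0] * N_WEAPONS for _ in range(N_TARGET_TYPES)]

def step(state, action):
    """Return (reward, next_state): reward 1 if the target is destroyed, else 0."""
    reward = 1.0 if random.random() < P_KILL[state][action] else 0.0
    return reward, random.randrange(N_TARGET_TYPES)   # a new target arrives

state = random.randrange(N_TARGET_TYPES)
for _ in range(50_000):
    # epsilon-greedy action selection
    if random.random() < EPSILON:
        action = random.randrange(N_WEAPONS)
    else:
        action = max(range(N_WEAPONS), key=lambda a: Q[state][a])
    reward, next_state = step(state, action)
    # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
    state = next_state

# Greedy policy read off the learned table.
for s in range(N_TARGET_TYPES):
    best = max(range(N_WEAPONS), key=lambda a: Q[s][a])
    print(f"target type {s}: choose weapon {best}")
```

In such a discrete setting, the DQN-to-tabular transition mentioned in the abstract can be thought of as evaluating the trained Q-network at each enumerated state and writing its outputs into the same kind of table, so that the two critics can be compared entry by entry.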
Subjects
Reinforcement learning; Critics




Indexed in Scopus



Conflict of interest


The authors state no conflict of interest.


Funding Information


This research received no external funding or grants.


Peer review


Peer review under responsibility of Defence Science Journal.


Ethics approval


Not applicable.


Consent for publication


Not applicable.


Acknowledgements


None.