An Efficient Node Selection Policy for Monte Carlo Tree Search with Neural Networks

Accepted for publication in the INFORMS Journal on Computing

41 pages. Posted: 23 May 2023. Last revised: 6 Sep 2023.

Xiaotian Liu

Georgia Institute of Technology

Yijie Peng

Peking University

Gongbo Zhang

Peking University - Guanghua School of Management

Ruihan Zhou

Peking University - Guanghua School of Management

Date Written: May 17, 2023

Abstract

Monte Carlo Tree Search (MCTS) has been gaining increasing popularity, and the success of AlphaGo has prompted a new trend of incorporating a value network and a policy network constructed with neural networks into MCTS, namely NN-MCTS. In this work, motivated by the shortcomings of the widely used Upper Confidence Bounds applied to Trees (UCT) policy, we formulate the node selection problem in NN-MCTS as a multi-stage Ranking and Selection (R&S) problem and propose a node selection policy that efficiently allocates a limited search budget to maximize the probability of correctly selecting the best action at the root state. The value network and policy network in NN-MCTS further improve the performance of the proposed node selection policy by providing prior knowledge and guiding the selection of the final action, respectively. Numerical experiments on two board games and an OpenAI task demonstrate that the proposed method outperforms the UCT policy used in AlphaGo Zero and MuZero, implying the potential of constructing node selection policies in NN-MCTS with R&S procedures.

Keywords: Monte Carlo Tree Search, Node Selection Policy, Neural Networks, Ranking and Selection

Suggested Citation

Liu, Xiaotian and Peng, Yijie and Zhang, Gongbo and Zhou, Ruihan, An Efficient Node Selection Policy for Monte Carlo Tree Search with Neural Networks (May 17, 2023). Accepted for publication in the INFORMS Journal on Computing. Available at SSRN: https://ssrn.com/abstract=4450999 or http://dx.doi.org/10.2139/ssrn.4450999

Xiaotian Liu

Georgia Institute of Technology

Atlanta, GA 30332
United States

Yijie Peng (Contact Author)

Peking University

No 5 Yiheyuan Rd
Haidian District
Beijing, Beijing 100871
China

Gongbo Zhang

Peking University - Guanghua School of Management

Ruihan Zhou

Peking University - Guanghua School of Management
