Reinforcement Learning for Search Tree Size Minimization in Constraint Programming: New Results on Scheduling Benchmarks

55 Pages Posted: 27 Aug 2024

Vilém Heinz

Czech Technical University in Prague

Zdeněk Hanzálek

Czech Technical University in Prague

Petr Vilím

affiliation not provided to SSRN

Abstract

One of the significant search algorithms for scheduling with Constraint Programming (CP) is Failure-Directed Search (FDS), a generic, complete search algorithm designed to explore the search space efficiently. Despite its genericity, FDS has proved optimality of, or found optimal solutions for, numerous scheduling instances that had been open for decades. In this paper, we focus on the properties of FDS and show that minimizing the size of its search tree, which is guided by branching decisions (choices) ordered by continuously updated ratings, has the same properties as arm selection in the Multi-armed Bandit (MAB) problem. Because the exploration-exploitation dilemma, a key aspect of the MAB problem, was not considered in the original FDS, we apply various reinforcement learning algorithms for the MAB problem to FDS. We extend them with problem-specific improvements and further enhance the performance of FDS by parameter tuning. For the performance evaluation, we use the Job Shop Scheduling Problem (JSSP) and the Resource-Constrained Project Scheduling Problem (RCPSP), two of the most fundamental scheduling problems, which have been studied for almost 60 years. The results show that the improved FDS, implemented in a new solver called OptalCP, is 1.7 times faster on JSSP benchmarks and 2.5 times faster on RCPSP benchmarks than the original FDS. It is also 3.5 times faster on JSSP and 2.1 times faster on RCPSP than the current state-of-the-art FDS algorithm in IBM CP Optimizer 22.1. Another contribution is the improvement of existing state-of-the-art lower bounds for these problems. Among all open instances in standard benchmark sets, the lower bounds of 78 out of 84 JSSP instances and 226 out of 393 RCPSP instances were improved using a time limit of just 900 seconds per instance. A small number of these instances were also closed (their optimal makespan was found).
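
The analogy between rated branching choices and bandit arms can be illustrated with a minimal sketch. It is not the algorithm from the paper or from OptalCP: it assumes a plain epsilon-greedy policy, a running-mean rating where lower is better, and synthetic noisy costs standing in for the search effort observed after a choice. The names `EpsilonGreedyChooser` and `simulate` are hypothetical.

```python
import random


class EpsilonGreedyChooser:
    """Hypothetical helper: treats each branching choice as a bandit arm."""

    def __init__(self, n_choices, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_choices
        # Rating = running mean of observed cost (assumption: lower is better).
        self.ratings = [0.0] * n_choices
        self.rng = random.Random(seed)

    def select(self):
        # Try every choice once before trusting any rating.
        for i, count in enumerate(self.counts):
            if count == 0:
                return i
        # Explore: occasionally pick a uniformly random choice.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.ratings))
        # Exploit: pick the choice with the lowest rating so far.
        return min(range(len(self.ratings)), key=self.ratings.__getitem__)

    def update(self, choice, cost):
        # Incremental update of the running mean for the selected choice.
        self.counts[choice] += 1
        self.ratings[choice] += (cost - self.ratings[choice]) / self.counts[choice]


def simulate(true_costs, steps=2000, seed=0):
    """Run the chooser against synthetic costs with Gaussian noise."""
    noise = random.Random(seed + 1)
    chooser = EpsilonGreedyChooser(len(true_costs), epsilon=0.1, seed=seed)
    for _ in range(steps):
        arm = chooser.select()
        cost = true_costs[arm] + noise.gauss(0.0, 0.05)
        chooser.update(arm, cost)
    return chooser


if __name__ == "__main__":
    result = simulate([0.9, 0.4, 0.7])
    print("pull counts:", result.counts)
    print("ratings:", [round(r, 3) for r in result.ratings])
```

With a nonzero `epsilon`, every choice keeps being sampled occasionally, so a rating that was unlucky early on can still recover. A purely greedy ordering by rating has no such mechanism, which is the exploration-exploitation gap the abstract refers to.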

Keywords: Constraint Programming, Reinforcement Learning, Discrete Optimization, Scheduling, Tree Search, Heuristics

Suggested Citation

Heinz, Vilém and Hanzálek, Zdeněk and Vilím, Petr, Reinforcement Learning for Search Tree Size Minimization in Constraint Programming: New Results on Scheduling Benchmarks. Available at SSRN: https://ssrn.com/abstract=4938242 or http://dx.doi.org/10.2139/ssrn.4938242

Vilém Heinz (Contact Author)

Czech Technical University in Prague

Thakurova 2077/7
Prague 6, 16629
Czech Republic
