Pitch Sequences in Baseball: Analysis Using a Probabilistic Topic Model
16 Pages. Posted: 6 Jan 2021. Last revised: 16 Nov 2021
Date Written: November 11, 2020
Abstract
In baseball, the pitch sequence is one of the most important factors in winning or losing a game; a single pitch may even decide the outcome. It is therefore useful for both batteries and hitters to understand how pitch sequences vary with factors such as pitcher/hitter characteristics, game situations, and hitting results. However, while statistical techniques and machine learning methods have been increasingly applied to baseball data in recent years, pitch sequencing remains an under-studied area in baseball theory. This study investigates the relationship between pitch sequences and pitcher/hitter characteristics, game situations, and hitting results. A probabilistic topic model is applied to pitch-by-pitch data from all regular-season games of Nippon Professional Baseball (NPB) in 2016–2018. The results show that the model effectively extracts pitch sequence trends: it identifies the pitch sequences likely to be used against sluggers, the pitch sequences effective for striking out a hitter, and the pitch sequence trends of an individual pitcher and of a specific match-up.
Keywords: Probabilistic topic model, Baseball data, Pitch sequence
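As a rough illustration of how pitch-by-pitch data could be prepared for a probabilistic topic model, the sketch below turns at-bats into bag-of-words "documents". Everything in it is an assumption made for illustration: the abstract does not say how documents or words are defined, so the grouping by pitcher, the use of consecutive pitch pairs as words, the pitch labels, and the function names are all hypothetical, and the toy data is not from the NPB data set.

```python
from collections import Counter

# Hypothetical toy data: (pitcher_id, pitch types thrown in one at-bat, in order).
# Pitch labels and IDs are illustrative, not taken from the NPB data set.
AT_BATS = [
    ("P1", ["FF", "SL", "FF"]),
    ("P1", ["FF", "SL"]),
    ("P2", ["CU", "FF", "CH"]),
]

def sequence_tokens(pitches):
    """Turn one at-bat into 'words': consecutive pitch pairs such as 'FF>SL'."""
    return [f"{a}>{b}" for a, b in zip(pitches, pitches[1:])]

def build_documents(at_bats):
    """Group at-bats by pitcher; each pitcher becomes one bag-of-words 'document'.

    Assumption: grouping by pitcher is only one possible choice. Grouping by
    hitter, match-up, or game situation would follow the same pattern.
    """
    docs = {}
    for pitcher, pitches in at_bats:
        docs.setdefault(pitcher, Counter()).update(sequence_tokens(pitches))
    return docs

def count_matrix(docs):
    """Return (row labels, sorted vocabulary, document-term count rows)."""
    vocab = sorted({tok for counts in docs.values() for tok in counts})
    labels = sorted(docs)
    # Counter returns 0 for tokens a document never used.
    rows = [[docs[d][tok] for tok in vocab] for d in labels]
    return labels, vocab, rows

labels, vocab, rows = count_matrix(build_documents(AT_BATS))
```

A document-term count matrix of this shape is the usual input to topic-model implementations such as latent Dirichlet allocation; each inferred topic would then be a distribution over pitch-pair "words", which is one way a pitch sequence trend could be represented.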