ABSTRACT
Professional sports is a roughly $500 billion industry that is increasingly data-driven. In this paper, we show how machine learning can be applied to generate a model that could lead to better on-field decisions by managers of professional baseball teams. Specifically, we show how to use regularized linear regression to learn pitcher-specific predictive models that can help decide when a starting pitcher should be replaced. A key step in the process is our method of converting categorical variables (e.g., the venue in which a game is played) into continuous variables suitable for the regression. Another key step is dealing with situations in which there is too little data to compute measures such as the effectiveness of a pitcher against specific batters.
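The abstract does not spell out how categorical variables are converted or how sparse data is handled, so the following is only a minimal sketch under stated assumptions. It replaces each category with its mean outcome shrunk toward the global mean (so categories with few observations stay near the overall average) and then fits a closed-form ridge regression. The function names, the `prior_weight` parameter, and the choice of shrunken mean encoding are illustrative assumptions, not the authors' method.

```python
import numpy as np


def shrunken_means(categories, targets, prior_weight=5.0):
    """Map each category to its mean target, shrunk toward the global mean.

    Assumption for illustration: categories with few observations are pulled
    toward the global mean, one common way to cope with sparse data.
    """
    targets = np.asarray(targets, dtype=float)
    global_mean = targets.mean()
    sums, counts = {}, {}
    for category, target in zip(categories, targets):
        sums[category] = sums.get(category, 0.0) + target
        counts[category] = counts.get(category, 0) + 1
    encoding = {
        category: (sums[category] + prior_weight * global_mean)
        / (counts[category] + prior_weight)
        for category in sums
    }
    return encoding, global_mean


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression; the intercept is not penalized."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend intercept column
    penalty = lam * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unshrunk
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def predict(weights, X):
    """Apply fitted weights (intercept first) to a feature matrix."""
    X = np.asarray(X, dtype=float)
    return weights[0] + X @ weights[1:]
```

In use, a hypothetical venue column would be passed through `shrunken_means` together with an outcome such as runs allowed, and the resulting numeric column would be stacked with other features before calling `fit_ridge`. An unseen category can fall back to the returned global mean.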
For each season, we trained on the first 80% of the games and tested on the rest. The results suggest that using our model could have led to better decisions than those made by major league managers. Applying our model would have led to a different decision 48% of the time. For those games in which a manager left in a pitcher that our model would have removed, the pitcher ended up performing poorly 60% of the time.
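The evaluation protocol described above can be sketched as follows. This assumes the games of a season are already sorted by date and that each decision is recorded as a simple label; the function names and the "stay"/"pull" labels are hypothetical, chosen only for illustration.

```python
def chronological_split(games, train_fraction=0.8):
    """Split a date-ordered sequence of games into train and test parts.

    The earliest games go to training and the remainder to testing, so the
    model is never evaluated on games that precede its training data.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(games) * train_fraction)
    return games[:cut], games[cut:]


def disagreement_rate(manager_decisions, model_decisions):
    """Fraction of situations in which the model and the manager differ."""
    if len(manager_decisions) != len(model_decisions):
        raise ValueError("decision lists must have the same length")
    if not manager_decisions:
        raise ValueError("decision lists must not be empty")
    differing = sum(
        1 for manager, model in zip(manager_decisions, model_decisions)
        if manager != model
    )
    return differing / len(manager_decisions)
```

A chronological split is used here rather than a random one because it matches the "first 80% of the games" wording and avoids training on games played after those being tested.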
Index Terms
- A data-driven method for in-game decision making in MLB: when to pull a starting pitcher