DOI: 10.1145/2487575.2487660
poster

A data-driven method for in-game decision making in MLB: when to pull a starting pitcher

Published: 11 August 2013

ABSTRACT

Professional sports is a roughly $500 billion industry that is increasingly data-driven. In this paper, we show how machine learning can be applied to generate a model that could lead to better on-field decisions by managers of professional baseball teams. Specifically, we show how to use regularized linear regression to learn pitcher-specific predictive models that can be used to help decide when a starting pitcher should be replaced. A key step in the process is our method of converting categorical variables (e.g., the venue in which a game is played) into continuous variables suitable for the regression. Another key step is dealing with situations in which there is an insufficient amount of data to compute measures such as the effectiveness of a pitcher against specific batters.
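
The abstract does not say how categorical variables are converted or how sparse data is handled. As a rough illustration only, one common approach is to replace each category with the mean of the target for that category, shrunk toward the global mean when few observations exist. The function name, the `prior_weight` parameter, and the toy venue data below are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def shrunken_mean_encoding(categories, targets, prior_weight=10.0):
    """Map each category to a shrunken mean of the target.

    Illustrative assumption, not the paper's method: categories with few
    observations are pulled toward the global mean, so a venue seen once
    does not receive an extreme value.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    encoding = {
        category: (sums[category] + prior_weight * global_mean)
        / (counts[category] + prior_weight)
        for category in counts
    }
    return encoding, global_mean


# Hypothetical toy data: runs allowed in an inning, keyed by venue.
venues = ["A", "A", "B"]
runs_allowed = [1.0, 3.0, 5.0]
encoding, fallback = shrunken_mean_encoding(venues, runs_allowed, prior_weight=1.0)
```

Under this sketch, a category never seen in training would fall back to `fallback`, the global mean.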

For each season we trained on the first 80% of the games, and tested on the rest. The results suggest that using our model could have led to better decisions than those made by major league managers. Applying our model would have led to a different decision 48% of the time. For those games in which a manager left a pitcher in that our model would have removed, the pitcher ended up performing poorly 60% of the time.
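
The evaluation setup described above (train on the first 80% of games, test on the rest, with a regularized linear model) can be sketched as follows. This is a minimal sketch under stated assumptions: ridge regression stands in for "regularized linear regression", the rows of `X` are assumed to be in chronological order, no intercept is fitted, and the synthetic data, function names, and `lam` value are hypothetical. In the paper's setting a model of this kind would be fit separately for each pitcher.

```python
import numpy as np


def chronological_split(X, y, train_fraction=0.8):
    """Split rows in order: the first fraction trains, the rest tests."""
    n_train = int(len(X) * train_fraction)
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)


# Synthetic stand-in for one pitcher's season (assumption, not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([0.5, -1.0, 2.0])
y = X @ true_w + 0.1 * rng.normal(size=100)

X_train, y_train, X_test, y_test = chronological_split(X, y)
w = fit_ridge(X_train, y_train, lam=1.0)
test_mse = float(np.mean((X_test @ w - y_test) ** 2))
```

Splitting by position rather than at random keeps later games out of the training set, which matches the "first 80%" wording.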

Published in

KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2013
1534 pages
ISBN: 9781450321747
DOI: 10.1145/2487575

Copyright © 2013 ACM

                Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                Publisher

                Association for Computing Machinery

                New York, NY, United States

                Acceptance Rates

KDD '13 Paper Acceptance Rate: 125 of 726 submissions, 17%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
