Learn how probability, math, and statistics can be used to help baseball, football, and basketball teams improve player and lineup selection as well as in-game strategy.


From the course by University of Houston System

Math behind Moneyball

24 ratings


From the lesson

Module 1

You will learn how to predict a team’s won-loss record from the number of runs, points, or goals scored by a team and its opponents. Then we will introduce you to multiple regression and show how multiple regression is used to evaluate baseball hitters. Excel data tables and the VLOOKUP, MATCH, and INDEX functions will be discussed.

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

Okay, so let's talk about a concept that was discussed a lot in the Moneyball book, and in the movie I believe: OPS. So we're rating hitters based on OPS, which is on-base percentage plus slugging percentage.

So slugging percentage again: you get one point for a single, two for a double, three for a triple, four for a home run, and basically you take that weighted total of your hits and divide by your at bats. On-base percentage is going to be based on walks plus hit by pitch plus hits. So for on-base percentage here, what I did is I took hits plus walks plus hit by pitch, and I divided by at bats plus hit by pitch plus walks plus sacrifice flies, because a sacrifice fly is a plate appearance. You can leave that out if you want to.
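The two definitions just described can be sketched in a few lines of Python. This is only an illustration: the function names and the season line below are made up for the example and do not come from the course worksheet.

```python
def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Weighted hit total (1, 2, 3, 4 points per hit type) divided by at bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies=0):
    """Hits + walks + hit by pitch, divided by at bats + walks + hit by pitch
    + sacrifice flies (the sacrifice-fly term is optional, as in the lecture)."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    return times_on_base / opportunities


# Made-up season line, for illustration only.
singles, doubles, triples, home_runs = 100, 30, 5, 25
hits = singles + doubles + triples + home_runs

slg = slugging_percentage(singles, doubles, triples, home_runs, at_bats=550)
obp = on_base_percentage(hits, walks=60, hit_by_pitch=5, at_bats=550,
                         sacrifice_flies=5)

print(f"SLG = {slg:.3f}, OBP = {obp:.3f}")
```

Leaving `sacrifice_flies` at its default of 0 gives the version of on-base percentage without sacrifice flies in the denominator.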

Okay, and I probably should have had sacrifice hits and bunts in there, but I didn't do that. That's not going to change things much. Slugging percentage, as we said: you get one point for a single, two points for a double, three points for a triple, four points for a home run, and you divide that by at bats.

So the question is how well can we do predicting runs scored from these two simple numbers, rather than using singles, doubles, triples, homers, walks, and hit by pitch. This is just a much simpler way to look at predicting runs scored by a team, and it can also turn into a simpler way to evaluate a hitter, because you only have to look at two things. Slugging percentage sort of has power in it. There are more complicated measures of power. I think there's an isolated power measurement, but we're not going to get into that.

But if you run the regression, with runs scored as your y column and slugging percentage and on-base percentage as your x columns, and you go to the worksheet Slugging On Base percentage, you get this equation here. So your best way to predict runs scored for the season would be -1,003 + 1,700 × SLG + 3,157 × OBP, where OBP is on-base percentage. And these are all very significant. Look at those p-values, they have something like 29 zeros after the decimal point. They really do matter in predicting runs scored. And basically, that's a nice simple way to rank hitters. If you take the slugging percentage plus around two times the OBP, you can basically figure out in your head, hey, is this guy per plate appearance a better hitter than the other guy. And so that's a nice way to look at things.

By the way, I checked on outliers. The standard error here is 26, so 95% of our forecasts are accurate within double that, let's say 51 runs.
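As a rough sketch, the fitted equation can be applied like this. The coefficients and the standard error are the ones quoted in the lecture; the team's SLG and OBP are hypothetical values chosen for illustration, and the band simply doubles the rounded standard error of 26 (the lecture quotes about 51 runs, presumably from the unrounded figure).

```python
# Coefficients as quoted in the lecture (team season totals).
INTERCEPT = -1003
SLG_WEIGHT = 1700
OBP_WEIGHT = 3157
STANDARD_ERROR = 26  # runs, as quoted in the lecture (rounded)


def predict_runs(slg, obp):
    """Predicted season runs from team slugging and on-base percentage."""
    return INTERCEPT + SLG_WEIGHT * slg + OBP_WEIGHT * obp


# Hypothetical team, for illustration only.
team_slg, team_obp = 0.400, 0.330

predicted = predict_runs(team_slg, team_obp)
low = predicted - 2 * STANDARD_ERROR   # rough 95% band: two standard errors
high = predicted + 2 * STANDARD_ERROR

# How much more a point of OBP is worth than a point of SLG.
weight_ratio = OBP_WEIGHT / SLG_WEIGHT

print(f"Predicted runs: {predicted:.0f} (roughly {low:.0f} to {high:.0f})")
print(f"OBP weight / SLG weight = {weight_ratio:.2f}")
```

The ratio of the two coefficients comes out a little under 2, which is where the "slugging plus around two times on base" rule of thumb comes from.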

And that's slightly less accurate than using the linear weights, but not by much, and the R squared is 91%. So it's nice to have a simpler model with only two independent variables that's almost as good as the prediction model with the linear weights, because it's easier to understand and it's just about as accurate. So that's really pretty nice.

So that explains, again, why on base plus slugging should really be slugging plus two times on base: the weight of on-base percentage is approximately double the weight of slugging.

In the next video we're going to learn how to use linear weights to figure out how many runs per game a team made up of nine copies of a single player would score. And that's a nice metric to look at to see whether one hitter is better than another. You can say, okay, if I have nine Barry Bonds 2004s, I'd score 16 runs per game. If I had nine Mike Trouts 2014, I might score eight runs per game. And we'll see later that maybe a more important measure is runs above average: if you add a player to a lineup where everybody else is an average hitter, how many more runs would your team score, or how many fewer runs would your team score? We'll get to that in the next video.
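To see why the weighting point above matters, here is a small sketch comparing plain OPS with slugging plus two times on base. The two hitters and their numbers are invented for the example, and `weighted_rating` is just an illustrative name, not a standard statistic.

```python
def ops(slg, obp):
    """Plain on base plus slugging: both parts weighted equally."""
    return obp + slg


def weighted_rating(slg, obp, obp_weight=2.0):
    """Slugging plus about two times on base, per the regression weights."""
    return slg + obp_weight * obp


# Invented hitters as (SLG, OBP) pairs, for illustration only.
hitters = {
    "Hitter A": (0.520, 0.330),  # more power, gets on base less
    "Hitter B": (0.440, 0.400),  # less power, gets on base more
}

by_ops = sorted(hitters, key=lambda name: ops(*hitters[name]), reverse=True)
by_weighted = sorted(hitters, key=lambda name: weighted_rating(*hitters[name]),
                     reverse=True)

print("Ranked by OPS:          ", by_ops)
print("Ranked by SLG + 2 x OBP:", by_weighted)
```

With these made-up numbers the two rankings disagree: plain OPS puts the power hitter first, while doubling the weight on on-base percentage puts the on-base hitter first.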
