Learn how probability, math, and statistics can be used to help baseball, football, and basketball teams improve player and lineup selection as well as in-game strategy.


From the course by University of Houston System

Math behind Moneyball

40 ratings



From the lesson

Module 1

You will learn how to predict a team’s won-loss record from the number of runs, points, or goals scored by a team and its opponents. Then we will introduce you to multiple regression and show how multiple regression is used to evaluate baseball hitters. Excel data tables and the VLOOKUP, MATCH, and INDEX functions will be discussed.

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

Okay, so let's talk about a concept that was discussed a lot in the Moneyball book

and, I believe, the movie: OPS.

So we're rating hitters based on OPS, which is on-base percentage

plus slugging percentage.

So slugging percentage, again: you get one point for

a single, two for a double, three for a triple, four for a home run, and

basically you take that weighted total of your hits

and divide it by your at bats. On-base percentage

is going to be built from walks plus hit by pitch plus hits.

Okay, so for on-base percentage here,

what I did is I took hits plus walks plus hit by pitch, and

I divided by at bats plus hit by pitch plus walks

plus, I guess, sacrifice flies,

because a sacrifice fly is a plate appearance there.

You can leave that out if you want to.

Okay, and I probably should have had sacrifice hits (bunts)

in there, but I guess I didn't do that.

That's not going to change things much.

Slugging percentage: you get one point for a single,

two points for a double, as we said, three points for a triple, four points for

a home run, and you divide that by at bats.
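
As a minimal sketch of the two definitions just described, the Python below computes slugging percentage and on-base percentage from counting stats. The function names and the sample stat line are hypothetical and chosen only for illustration. Sacrifice flies are an optional term in the on-base denominator, since the lecture notes they can be left out.

```python
def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases (1, 2, 3, 4 points per hit type) divided by at bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies=0):
    """(H + BB + HBP) / (AB + BB + HBP + SF), as described in the lecture."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    return times_on_base / opportunities


# Hypothetical stat line, made up for illustration only.
singles, doubles, triples, home_runs = 100, 30, 5, 25
hits = singles + doubles + triples + home_runs  # 160 hits
at_bats, walks, hit_by_pitch, sacrifice_flies = 550, 60, 5, 5

slg = slugging_percentage(singles, doubles, triples, home_runs, at_bats)
obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies)
print(f"SLG = {slg:.3f}, OBP = {obp:.3f}")
```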

So the question is how well can we do

predicting runs scored from these two simple numbers, rather than using singles,

doubles, triples, homers, walks, and hit by pitch?

This is just a much simpler way to look at predicting runs scored by a team, and

it can also turn into a simpler way to evaluate a hitter,

because you only have to look at two things.

Slugging percentage sort of has power in it.

There are more complicated measures of power.

I think there's an isolated power measurement but

we're not going to get into that.

But if you run the regression, where this is your y column and

these are your x columns, and

you go to the worksheet Slugging On Base percentage, you get this equation here.

So your best way to predict runs scored for the season

would be -1,003 + 1,700 ×

SLG + 3,157 × OBP, on-base percentage.

And these are all very significant.

Look at those p-values: they're essentially zero, something like 29 zeros after the decimal point.

They really do matter in predicting runs scored.
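
To make the regression equation concrete, here is a small sketch that plugs a team's slugging and on-base percentages into the fitted formula. The coefficients are the rounded values quoted in the lecture. The function name and the sample team numbers are hypothetical.

```python
# Rounded coefficients as quoted in the lecture.
INTERCEPT = -1003.0
SLG_COEF = 1700.0
OBP_COEF = 3157.0


def predict_season_runs(slg, obp):
    """Predicted team runs for a season from SLG and OBP."""
    return INTERCEPT + SLG_COEF * slg + OBP_COEF * obp


# Hypothetical team: slugging .400, on-base .330.
example_runs = predict_season_runs(0.400, 0.330)
print(f"Predicted runs: {example_runs:.0f}")  # about 719
```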

And basically, that's a nice simple way to rank hitters.

If you take the slugging percentage plus around two times the on-base percentage,

you can, in your head, basically figure out, hey,

is this guy, per plate appearance, a better hitter than the other guy?

And so that's a nice way to look at things.
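
The mental shortcut above can be sketched as follows. The ratio of the two regression coefficients, 3,157 / 1,700, is about 1.86, which is where "around two times" comes from. The two hitters and their stat lines are made up for illustration, and `quick_rating` is a hypothetical name.

```python
# Ratio of the OBP coefficient to the SLG coefficient from the regression.
obp_to_slg_ratio = 3157 / 1700  # roughly 1.86, so "around two"


def quick_rating(slg, obp, obp_weight=2.0):
    """Slugging plus about two times on-base percentage."""
    return slg + obp_weight * obp


# Hypothetical hitters: name -> (SLG, OBP).
hitters = {
    "Hitter A": (0.520, 0.330),
    "Hitter B": (0.450, 0.390),
}

# Rank by the weighted rating, then by plain OPS for comparison.
by_rating = sorted(hitters, key=lambda n: quick_rating(*hitters[n]), reverse=True)
by_ops = sorted(hitters, key=lambda n: sum(hitters[n]), reverse=True)

print("Weighted rating order:", by_rating)
print("Plain OPS order:", by_ops)
```

In this made-up case the two orderings disagree: plain OPS favors the slugger, while the weighted rating favors the hitter who gets on base more.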

By the way, I checked on outliers.

The standard error here is 26, so

95% of our forecasts are accurate within double that,

let's say 51 runs.

And that's slightly less accurate than using the linear weights, but not by much, and

the R squared is 91%.
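
As a rough sketch of the "within double the standard error" rule of thumb, the function below turns a predicted run total into an approximate 95% range. It assumes forecast errors are roughly normally distributed, which is an assumption and not something the lecture states. The default standard error is the rounded value of 26 quoted above, so the margin comes out as 52 where the lecture says about 51. The name `forecast_range` and the sample prediction are hypothetical.

```python
def forecast_range(predicted_runs, standard_error=26.0, multiplier=2.0):
    """Approximate 95% range: prediction plus or minus two standard errors."""
    margin = multiplier * standard_error
    return predicted_runs - margin, predicted_runs + margin


# Hypothetical prediction of 719 runs for a season.
low, high = forecast_range(719)
print(f"Roughly 95% of the time, actual runs fall between {low:.0f} and {high:.0f}")
```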

So it's nice to have a sort of simpler model with only two independent variables

that's basically almost as good as the prediction model with the linear weights,

because it's easier to understand and it's just about as accurate.

So that's really pretty nice okay.

So that explains, again, why on-base plus slugging should really be

slugging plus two times on-base, because the weight of on-base percentage

is approximately double the weight of slugging.

Well, in the next video we're going to learn how to use linear weights to

sort of figure out how many runs a team

made up of nine copies of a single player would score per game.

And that's sort of a nice metric to look at to see whether one hitter is better than another.

You could say, okay, if I had nine 2004 Barry Bonds, I'd score 16 runs per game.

If I had nine 2014 Mike Trouts, I might score eight runs per game.

And we'll see later that maybe a more important measure is runs above average:

if you add a player to a lineup where everybody else is an average hitter,

how many more runs would your team score, or how many fewer runs would your team

score? So we'll get to that in the next video.
