Learn how probability, math, and statistics can be used to help baseball, football, and basketball teams improve player and lineup selection as well as in-game strategy.

From the course by University of Houston System

Math behind Moneyball

36 ratings

From the lesson

Module 1

You will learn how to predict a team’s won-loss record from the number of runs, points, or goals scored by a team and its opponents. Then we will introduce you to multiple regression and show how multiple regression is used to evaluate baseball hitters. Excel data tables and the VLOOKUP, MATCH, and INDEX functions will be discussed.

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

Hi, I want to welcome you to our MOOC on the mathematics behind Moneyball.

My name is Wayne Winston.

I'm a clinical professor of decision and information sciences at

the Bauer College of Business at the University of Houston in Houston, Texas.

It's been a fairly good year for Houston sports.

With the Astros doing very well and

the Rockets going to the Western Conference Finals.

And a lot of that's due to Moneyball, with Daryl Morey, who's

an MIT Sloan School MBA, running the Rockets.

And, the front office at the Astros really being

touted as the most analytics-savvy front office perhaps in all of sports.

Sports Illustrated had a nice cover on the Astros saying

World Series Champions 2017 mainly because of analytics, and

the Astros may be ahead of schedule, who knows?

Okay, so we've got a file, sports syllabus, that will give you a list of

the topics we're going to cover and so let me give a brief overview of these topics.

We'll start with the famous Pythagorean theorem, not from geometry but

it comes from baseball.

Basically the idea behind the Pythagorean theorem is how

many games will a team win if you know how many runs they've scored and

given up in baseball or points in football or basketball or goals in hockey.

We'll spend some time on that,

you'll see it's important because it helps you understand how valuable

a player is if you know that scoring ten more runs will win you one more game.
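
The runs-to-wins idea can be sketched in a few lines of Python. This is a minimal sketch, assuming the classic Bill James form of the formula with an exponent of 2 (the course may fit a different exponent); the function name `pythagorean_win_pct` and the 750/700 run totals are hypothetical, chosen only for illustration.

```python
def pythagorean_win_pct(scored, allowed, exponent=2.0):
    """Estimated winning percentage from runs (or points) scored and allowed.

    Assumes the classic form scored^k / (scored^k + allowed^k) with k = 2.
    """
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


GAMES = 162  # length of a baseball regular season

# Hypothetical team: 750 runs scored, 700 runs allowed.
base_wins = GAMES * pythagorean_win_pct(750, 700)
# The same team after adding ten runs of offense.
more_wins = GAMES * pythagorean_win_pct(760, 700)

print(f"baseline wins: {base_wins:.1f}")
print(f"wins with ten more runs: {more_wins:.1f}")
print(f"extra wins: {more_wins - base_wins:.2f}")
```

For this made-up team, ten extra runs come out to roughly one extra win, which matches the rule of thumb mentioned above.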

Then we'll talk about the math of baseball because that's where this sort of all

started, largely thanks to the famous Bill James and

we'll talk about how you evaluate hitters.

We'll talk about how you evaluate pitchers and fielders.

We'll learn a bunch of statistics,

including multiple regression, which will be a key tool in our tool kit.

And you know I'm like Schultz on Hogan's Heroes in this course, I mean probably

a lot of you younger people don't know Schultz was on Hogan's Heroes but he said,

I know nothing.

So I assume you guys don't know anything about the topic and I'll try and

teach you everything.

So if you're intelligent and you work hard, you should learn a lot here.

Some of you may know a lot of this stuff, I'm not sure.

Okay, but then basically, while we're studying baseball, we'll learn a lot about Excel.

Functions in Excel, conditional formatting,

pivot tables that we'll need to analyze a lot of sports data.

So you'll learn a lot of Excel, as well as learning a lot about sports and math.

Then we'll talk about an important tool called Monte Carlo simulation.

Playing out a certain situation over and

over, which is a pretty useful way to evaluate a baseball team: if

you're going to add a player to the lineup, how will your team do?
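
To make "playing out a situation over and over" concrete, here is a deliberately tiny Monte Carlo sketch in Python. It is a toy, not the course's model: it assumes every plate appearance is either an out or a walk, so a run scores only when a walk comes with the bases loaded. The function names and the 0.30 and 0.40 on-base probabilities are hypothetical.

```python
import random


def simulate_half_inning(on_base_prob, rng):
    """Play one half-inning in the toy model and return the runs scored."""
    outs = 0
    runners = 0
    runs = 0
    while outs < 3:
        if rng.random() < on_base_prob:
            if runners == 3:
                runs += 1  # bases loaded: the walk forces in a run
            else:
                runners += 1  # the walk puts one more runner on base
        else:
            outs += 1
    return runs


def average_runs_per_inning(on_base_prob, trials=20000, seed=0):
    """Average runs per half-inning over many simulated half-innings."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    total = sum(simulate_half_inning(on_base_prob, rng) for _ in range(trials))
    return total / trials


# Compare a lineup that reaches base 30% of the time with one at 40%.
weaker = average_runs_per_inning(0.30)
stronger = average_runs_per_inning(0.40)
print(f"runs per inning at 0.30: {weaker:.3f}")
print(f"runs per inning at 0.40: {stronger:.3f}")
```

A realistic version would track singles, doubles, home runs, and individual batters, but the loop has the same shape: simulate the situation many times and average the result.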

And then we'll even talk about Deflategate in the context of simulation.

And yes, I sort of do think the Patriots were doing something to these footballs

for a while.

Then we'll talk about baseball decision-making.

If you saw the Moneyball movie or read the Moneyball book by Michael Lewis, you know

that most aficionados in Moneyball think you shouldn't bunt much in baseball.

We'll talk about why that's true.

Other things in decision making, talk about evaluating fielders as I said,

evaluating pitchers, the famous concept of wins above replacement which is

actually used fairly often now to determine sports writers' MVP votes.

I mean Mike Trout should have won one year over Miguel Cabrera even if he

won the Triple Crown, and we'll discuss that.

Okay. And then we'll talk about some newer

developments in baseball, the shifts that you've been seeing in baseball, okay?

There's a great book, Big Data Baseball,

about the Pirates sort of starting this trend towards shifting.

And how do you evaluate a catcher's defense?

Catcher framing,

a catcher can sort of make a pitch look like a strike when it's not.

How the park a player plays in

affects your evaluation of his hitting ability or pitching ability.

Then we need to learn a little bit of statistics.

What's a random variable?

What's a normal random variable?

Talk about the famous hot hand fallacy.

Do players get red hot?

Do teams get red hot?

And then we'll turn our attention to I guess America's game, football.

What makes NFL teams win?

We'll find out, it's mainly yards per pass attempt.

We'll try and understand the famous NFL quarterback rating formula

using regression.

And then we'll get into something extremely important,

football decision-making.

For instance,

do football teams go for it enough on fourth down?

Do they have the right run-pass mix and things like that?

The famous Bill Belichick decision against the Colts to go for it on fourth and

two when the Colts were trailing the Patriots by six points.

It was the right decision,

even though a lot of announcers really did not think it was.

Okay, then we'll talk about how you can use great stuff on footballreference.com.

We'll be using all their sites, basketball reference,

baseball reference, they're just invaluable to the sports analysts group.

We'll really talk about sports analytics when we talk about

the math behind Moneyball, that's the term that's used often.

And we'll talk about how you can use text functions and the play by play data

on pro-football-reference.com to evaluate teams' play selection.

The Texans, for instance, are really horrible at running plays on 1st and 10,

but they did a lot of them.

Then we'll talk about two person zero sum game theory, which explains why,

even if your passing offense is better than your running offense,

you shouldn't always pass.

And if your running offense gets better,

you should actually maybe run less than you did before, surprisingly.

Then we'll get to basketball, which I feel like I know the most about.

Because with my colleague and friend Jeff Sagarin of USA Today, we worked with

Mark Cuban in 1999 and 2000 to develop a system to rate players and

lineups and a lot of work has been done,

sort of I think off our work to really advance basketball analytics.

We'll talk about shot selection; the Lakers' and

Knicks' shot selection in 2014-2015 was atrocious.

The Rockets shot selection was great.

We'll talk about the four factors that make a basketball team win and

some of John Hollinger's great work on team metrics.

Points per possession which I think you've heard of.

Then we'll talk about how to figure out how good a player is, a basketball player.

You should look at the box score,

you should look at how the player moves the score of the game.

We'll talk about how to evaluate lineups.

A concept Jeff and I developed, adjusted plus-minus, which has

now been revamped into ESPN's pretty good regularized Real Plus-Minus system.

We'll talk about how analytics helped the Mavs beat the Heat.

Some of the new data that's available in basketball,

SportVU data, which really fits the definition of big data.

And then we'll talk about using Excel and optimization to set point spreads.

How do you rate NFL teams, find the strength of schedule,

how do you figure out a prediction for the total score of a game?

How can you predict the score of a soccer game?

We'll talk about rating teams just based on wins and losses because the BCS

required that, which was sort of silly, but we couldn't do anything about it.

How do you simulate the NCAA tournament to figure out the odds of a team's winning?

I think there were some bets Vegas gave out on Kentucky that were pretty

strange; they're called prop bets.

We'll talk about that.

How do you simulate the NFL playoffs at the beginning of the playoffs

and figure out the odds of each team winning? The last

two years, basically, I think we had the Seahawks and the Patriots to win going in.

How can you rate NASCAR drivers if you have place finishes in each race?

Then we'll talk a bit about gambling.

You may know the money line is how you bet on who wins a game, and

you may know about point spreads, and they're related.

If you know the point spread, you can figure out the money line and vice versa.
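
One common way to connect the two is sketched below in Python. It assumes the final margin of a game is roughly normally distributed around the point spread; the standard deviation of 14 points is an illustrative assumption, not a value from this lecture, and the function names are hypothetical. The money line computed here is a "fair" line with no bookmaker margin.

```python
import math


def win_probability(spread, margin_sd=14.0):
    """P(favorite wins) when the final margin is modeled as Normal(spread, margin_sd).

    spread is the number of points the favorite is favored by.
    margin_sd = 14.0 is an assumed value for illustration only.
    """
    z = spread / margin_sd
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fair_money_line(p):
    """American money line with no bookmaker margin, for 0 < p < 1."""
    if p >= 0.5:
        return -100.0 * p / (1.0 - p)  # favorite: amount risked to win 100
    return 100.0 * (1.0 - p) / p  # underdog: amount won on a 100 stake


p_favorite = win_probability(7.0)  # hypothetical 7-point favorite
print(f"favorite win probability: {p_favorite:.3f}")
print(f"fair line on the favorite: {fair_money_line(p_favorite):.0f}")
print(f"fair line on the underdog: {fair_money_line(1.0 - p_favorite):+.0f}")
```

Going the other way, from a money line back to a spread, means converting the line to a probability and inverting the normal CDF under the same assumption.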

How can you tell somebody has a successful betting system?

How do prop bets work?

You know that you could bet on the first score in the Broncos-Seahawks game

being a safety.

And what were the odds on that?

How did Vegas figure them out?

Were they correct?

Then if you do have a successful method for

betting, what percentage of your money should you bet on each bet?

Even if you're picking 80% against the point spread, you shouldn't bet 100% of

your money each time because then one loss will wipe you out.
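
One standard answer to the bet-sizing question is the Kelly criterion. The lecture does not name it at this point, so treat its use here as an assumption. A minimal Python sketch follows, assuming the usual point-spread terms of risking 11 to win 10; the function name `kelly_fraction` is hypothetical.

```python
def kelly_fraction(win_prob, net_odds):
    """Fraction of bankroll to stake under the Kelly criterion.

    win_prob: probability of winning the bet.
    net_odds: profit per unit staked when the bet wins.
    Returns 0.0 when the bet has no positive edge.
    """
    fraction = win_prob - (1.0 - win_prob) / net_odds
    return max(0.0, fraction)


# Assumed point-spread terms: risk 11 to win 10.
SPREAD_BET_ODDS = 10.0 / 11.0

for p in (0.50, 0.55, 0.80):
    share = kelly_fraction(p, SPREAD_BET_ODDS)
    print(f"win rate {p:.0%}: stake {share:.1%} of bankroll")
```

Under these assumptions, even an 80% picker would stake well under the whole bankroll on each bet, and a 50% picker would not bet at all, since the 11-to-10 terms require winning more than 11/21 (about 52.4%) of bets just to break even.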

What is sports arbitrage?

Sometimes there could be opportunities to

lock in a small profit before the money line or the point spread changes.

And then the big trend is toward these daily fantasy sports games you hear

advertised, Skill Zone, FanDuel, DraftKings, etc.,

on ESPN radio and TV; probably 20 times a day you see ads for those.

And we'll use FanDuel, for example, but how does daily fantasy sports work?

And how can you use optimization to come up with your best daily

fantasy lineup?

And we'll have test questions throughout the course, of course and other problems.

So how should you study for this course?

Start with suggested readings here, and

they're both by me so I guess I think they're okay.

You really don't need to buy these.

I mean, I think if you just watch the videos

you'll be able to learn everything you need.

So I wrote a book Mathletics.

The paperback is with Princeton University Press in 2009,

a lot of what we'll do is in that book.

Although I think every day that book becomes more outdated because the field

is changing by the day.

But there's a lot of stuff we'll do that is explained in more detail in Mathletics.

And then I have a book with Microsoft Press,

Data Analysis and Business Modeling with Excel 2013.

And we'll be using Excel 2013.

And Mac users, you need to run a Windows version

of Excel to do some of the things that we're going to do.

I can't do anything about that, I'm sorry.

Okay, so how should you study?

So for each video, you can see there's a list of the videos here.

I haven't finished all of the videos at this point.

Okay, but you should open up the starting file that goes with each video.

See there's the starting file in column I, and

then you should watch the video, pause it if needed and

follow along with the starting file and try and do what I do.

That, I mean, at Microsoft they would say everybody likes to drive to learn how to

do things, and that means basically follow along, just be an active learner, okay?

And then once you finish and think you understand what's in the video,

there will be a homework problem.

Try the homework problem on your own, you'll have the answer.

And then you can try the test question that goes with that homework problem.

And basically your grade will be totally based on the test questions.

So 90% and above will be a pass with distinction.

70%, I guess, to 89.9% will be a pass,

and below 70% will be a fail.

In that case, of course, you will not get credit for the course, okay?

Or another way you can approach the class is watch the video in its entirety and

concentrate on what I do and then try and go back with the starting file.

And do it from scratch yourself and see if you can duplicate what I did and

then try the homework problem and attempt the test question.

But that should get you going on how to learn about this subject which I'm

sure you're interested in, okay?

You like math and we all love sports and a lot of us are fascinated by the way

math can help a team win, okay, make a team better.

And that's what we're here to learn about and we'll learn a lot of interesting

math along the way like regression, simulation, resampling, optimization.

So, hopefully you'll get started on the course and I hope you'll enjoy it.
