Learn how probability, math, and statistics can be used to help baseball, football, and basketball teams improve player and lineup selection as well as in-game strategy.

From the course by University of Houston System

Math behind Moneyball

From the lesson

Module 10

You will learn how Kelly Growth can optimize your sports betting, how regression to the mean explains the SI cover jinx and how to optimize a daily fantasy sports lineup. We close with a discussion of golf analytics.

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

So let's show you an example of an optimization model for FanDuel basketball,

and the same principles would apply to other sports.

Okay, so if you log on FanDuel, I think this was during the 2014 season,

Kevin Love was playing.

Okay, you get a listing of every player who's playing that night.

And what they'll give you is their FanDuel points per game. In other words,

it's pretty close, I think, to an average of how many FanDuel points

the player scored per game that they played.

So Kevin Durant averaged 50 points per game, Kevin Love 48, LaMarcus Aldridge 42.

It tells you how many games they played.

Okay.

It tells you Oklahoma City's playing at Cleveland that night.

Minnesota was playing at Houston.

Here's their salary, and you have $60,000 to spend and

you want to maximize your FanDuel points, okay.

And so, if you think about a solver model,

the target cell is maximize the points for the players picked.

Now we'll assume here that the FanDuel points is your best guess at how many

points each player would score.

And that's the key to being good at FanDuel.

I mean, the Solver model's nice.

But the key to being good at FanDuel is to try and

figure out how many points you think a guy will score every night.

And where this is tricky as one of my students pointed out is okay,

let's suppose Tony Parker is hurt for the Spurs,

in the year they won the championship 2014.

Well you know Patty Mills is going to play more minutes so if Patty Mills has been

averaging 15 FanDuel points per game, he's going to score more that night

because Patty Mills is going to play more when Tony Parker's out.

And trying to juggle,

have a system to update projections based on injuries is important.

And in the playoffs of 2015 this wreaked havoc

with anybody I think playing FanDuel.

You didn't know if John Wall was going to play.

You didn't know if Kyrie Irving was going to play.

I mean it seemed like every team except the Warriors had an injury.

Okay, I mean the Warriors played against four teams in the playoffs and

every one had an injured point guard.

So they'll win the title perhaps, after seeing that game one victory in overtime.

But, I mean, there's a lot of luck.

I mean, the Warriors haven't had any injuries this year.

Other teams have had severe injuries, so

next year my guess is regression towards the mean.

If the Warriors win the championship this year, that's why it's tough to repeat,

because if you won, things went well for you.

And then, the Spurs for

instance, have never repeated the year after they won the title.

Perhaps because they're worn out from all those playoff games.

But it's hard to repeat in any sport, ask the Seahawks, ask the Spurs.

You want to maximize the points for the players picked.

Now the changing cells are going to be 0-1.

We call these binary changing cells, and there's one for each player.

So one means you pick the player.

Zero means you don't pick the player.

And we'll show you how to do this.

Now, Solver's got a limit of 200 changing cells, and this is in a linear model.

And in baseball, for instance, or in football, you'd go way over this, so

you'd have to buy a bigger version of the Solver.

And solver.com, appropriately enough, will sell you a bigger version of the solver.

You know in Jaws they said, you're going to need a bigger boat.

Well if you want to play FanDuel baseball or football you want to use optimization,

you're going to need a bigger Solver.

Okay, so we've got the changing cells,

now the constraints are you need two point guards, two shooting guards.

Two power forwards,

two small forwards, and a center.

And you can't spend more than $60,000.

And this turns out to be what we call a linear Solver model,

which is really good because they're easy to solve for the computer.

So, nothing we've had so far is a [INAUDIBLE] solver model.

It's funny, our last solver model won't be linear.

And what defines a linear Solver model is that the target cell is a sum of terms,

each one a changing cell times a constant.

And the constraints are the same way.

Okay, so our changing cells will be binary 0-1.

They'll be right here, I've just taken the top 100 or so players, okay.

And what we have to do, here's the position they played.

We have to create a matrix of 1's and

0's, to tell us which player plays which position.

So I can do an if statement.

So I can say, if the player's position, and

I need to $ column A, so I hit F4 here, equals this position,

and here I need to $ the row, then I'll put a 1, otherwise a 0.

So you have to create a matrix that tells you what position each guy plays.
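As a rough sketch of that IF-statement matrix outside of Excel, the Python below builds the same 1's and 0's for a few of the players named in this lesson. The list layout and the position abbreviations are assumptions for illustration, not the spreadsheet's actual columns.

```python
# Illustrative rows: (player, position). The layout is an assumption, not the real sheet.
players = [
    ("Kevin Durant", "SF"),
    ("Kevin Love", "PF"),
    ("LaMarcus Aldridge", "PF"),
    ("Dwight Howard", "C"),
    ("Brandon Knight", "PG"),
    ("Damian Lillard", "PG"),
]
positions = ["PG", "SG", "SF", "PF", "C"]

# One row per player, one column per position: 1 if the player plays it, else 0.
# This mirrors IF(player's position = column header, 1, 0) copied across and down.
indicator = [[1 if pos == col else 0 for col in positions] for _, pos in players]

for (name, _), row in zip(players, indicator):
    print(name, row)
```

Each row has exactly one 1, which is what lets a sumproduct against the pick column count players by position.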

So Kevin Durant plays small forward.

Okay, Kevin Love plays power forward as does Aldridge.

Here's the center DH, Dwight Howard.

Here's a point guard, Brandon Knight.

And Damian Lillard is a point guard, as is Kyrie Irving, as is John Wall.

Okay.

So now I can figure out how many people played each position.

Okay.

So for the point guards, I could sumproduct the pick column

(I could range name these, but I'm not going to;

I guess I did) with the point guard column.

See what we've got there.

So if I look at it, it's giving me an error message.

The pick column goes 9 through 128.

The point guard column goes 9 through 128.

Let's make sure that these are 0's or 1's.

So, if I would take the 0's and 1's here with the point guard column.

Okay, I guess I spelled sumproduct wrong.

Now, for the shooting guards, I could use sumproduct.

And I guess I could use that F3 trick.

The pick column with the point guard column.

Sorry, shooting guard column.

For the small forwards, sumproduct the pick column with the small forward column.

And for the power forwards, sumproduct the pick column

with the power forward column.

And for the center, sumproduct the pick column with the center column, which I believe

is named c_, because you can't name things c because it stands for column in Excel.

Okay.

Now how much did I spend?

I would sumproduct the pick column with the salary column.

Now the reason this works is whenever there's a 1 you pick up a number,

and the amount spent is a good way to explain why it's linear.

So I'll say salary with the pick column.

Okay, whenever there's a 1 in the pick column it picks up the salary.

So for example, if I put a 0 here by Aldridge,

it should reduce the salary by 9,300 that we've used.

See, it does.

And it's linear because it's changing cell times constant added together.

So everything we're doing here is linear.

Because when I counted how many point guards,

it was changing cell times constant added together.

That's what allows you to use the linear model and

makes the solver much more efficient in these problems.
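To make the "changing cell times constant added together" idea concrete, here is a minimal Python sketch of a sumproduct. The three-player data is hypothetical; only Aldridge's 9,300 salary comes from the lecture.

```python
def sumproduct(xs, ys):
    """Sum of pairwise products, like the spreadsheet SUMPRODUCT on two columns."""
    return sum(x * y for x, y in zip(xs, ys))

# Hypothetical three-player example; only the 9,300 salary is taken from the lecture.
pick = [1, 0, 1]                 # binary changing cells
salary = [9300, 8000, 7000]      # constants
is_pg = [0, 1, 1]                # constants from the position matrix

spent = sumproduct(pick, salary)     # 1*9300 + 0*8000 + 1*7000
num_pg = sumproduct(pick, is_pg)     # 1*0 + 0*1 + 1*1

# Setting the first pick to 0 drops the spend by exactly that player's salary.
pick[0] = 0
spent_without = sumproduct(pick, salary)
print(spent, num_pg, spent - spent_without)
```

Every quantity in the model (points, salary spent, players at each position) has this same form, which is why the whole model stays linear.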

Because, after all, I mean, with 120 players,

I could either pick each one or not.

I mean, basically, each player, pick 'em or not,

and in effect, it's not quite this simple,

but it's 2 to the 120th possible lineups, and there are more players than this.

That's about 10 to the 36th possible ways to pick players.

Of course some of those are not possible because they would involve more than nine

players but there are a lot of ways you can pick nine players.

Well, if you want to pick 9 players out of 128,

you could use combinations, although

this would include some lineups with more than two small forwards.

Okay, that's about 2 times 10 to the 13th,

which is 20 with 12 zeroes after it.

It would be 20 trillion, I guess.

That's a lot of possible lineups.
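Both counts can be checked in a couple of lines of Python using only the standard library. This is just a check of the arithmetic quoted above, not part of the spreadsheet.

```python
import math

# Every player is either picked or not: 2 choices for each of 120 players.
all_subsets = 2 ** 120

# Ways to choose 9 players out of 128, ignoring position and salary limits.
nine_of_128 = math.comb(128, 9)

print(f"{all_subsets:.2e}")    # on the order of 10 to the 36th
print(f"{nine_of_128:.2e}")    # roughly 2 times 10 to the 13th
```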

Now, finally, our points: we would sumproduct the pick column

with, I think it's the FPPG column.

Okay, so what we want to do is maximize the red by, let's change this to yellow,

changing the yellow to maximize the red, and

make sure we have the right number of players

at each position and don't spend too much money.

So I've got the solver window in there, but let's reset it.

Okay, so I want to maximize my points.

I want to change the pick.

So I can hit F3 and, I guess put in pick.

The pick should be binary, now how do I make those 0 and 1?

I hit F3, put in pick, and down here I

can choose: there's dif and int, which I don't think we need, but we can do bin.

And I could say the number at each position equals what I need.

And finally, spend less than $60,000.

So, spend less than 60,000, have the right number at each position,

everything's non-negative.

You don't really need that, but if I click solve here, okay,

I should get the right answer, 281.1.
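For readers without Solver, the same model can be sketched in Python on a toy player pool. Note the swap in technique: this brute-force search simply tries every legal lineup, which only works because the pool is tiny, whereas Solver uses integer linear optimization. The player names, salaries, and projections below are all hypothetical; only the roster requirements and the $60,000 cap come from the lecture.

```python
from itertools import combinations, product

CAP = 60_000
NEED = {"PG": 2, "SG": 2, "SF": 2, "PF": 2, "C": 1}  # roster rule from the lecture

# Hypothetical pool: (name, position, salary, projected FanDuel points).
POOL = [
    ("PG_A", "PG", 9000, 42.0), ("PG_B", "PG", 7000, 33.0), ("PG_C", "PG", 5000, 25.0),
    ("SG_A", "SG", 8500, 38.0), ("SG_B", "SG", 6500, 30.0), ("SG_C", "SG", 4500, 22.0),
    ("SF_A", "SF", 10000, 50.0), ("SF_B", "SF", 7000, 32.0), ("SF_C", "SF", 4000, 20.0),
    ("PF_A", "PF", 9500, 48.0), ("PF_B", "PF", 7500, 34.0), ("PF_C", "PF", 4500, 21.0),
    ("C_A", "C", 8000, 40.0), ("C_B", "C", 5000, 24.0),
]

def feasible_lineups(pool, need, cap):
    """Yield every lineup with the right count at each position and salary <= cap."""
    by_pos = {pos: [p for p in pool if p[1] == pos] for pos in need}
    groups = [list(combinations(by_pos[pos], k)) for pos, k in need.items()]
    for choice in product(*groups):
        lineup = [p for group in choice for p in group]
        if sum(p[2] for p in lineup) <= cap:
            yield lineup

def points(lineup):
    """Target cell: total projected points of the picked players."""
    return sum(p[3] for p in lineup)

best = max(feasible_lineups(POOL, NEED, CAP), key=points)
print(round(points(best), 1), sum(p[2] for p in best), [p[0] for p in best])
```

With the real slate of 100-plus players, enumeration like this is hopeless, which is the point of using a linear integer solver.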

Now, a little tip, when you do binary models, there’s an option.

It's called Integer Optimality Percentage.

If you make that a really small number, then you’re sure to get the right answer.

If you make that like a 5,

it may stop when it gets within 5% of the right answer.

So I don't think that's going to change anything.

Okay.

Now which players did I pick?

Well let's highlight them in red, we could use conditional formatting.

Whenever there's a 1 in the row, let's highlight the whole row.

Okay, so I do Home > Conditional Formatting > New Rule, use a formula.

And we'd say if the pick cell, and

I should $ the column here, not the row.

Let's say, greater or

equal to 0.99 because sometimes due to round off it doesn't equal 1.

If that’s true, you need to $ the column, so when it copies across,

it still pulls from column C.

We would, say, let’s do font red.
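The same round-off guard applies if you read the picks in code rather than through conditional formatting. A minimal sketch, with made-up player names and solver output:

```python
# Hypothetical solver output: a binary cell can come back as 0.9999999 or 1e-9
# instead of exactly 1 or 0 because of round-off.
names = ["Player A", "Player B", "Player C", "Player D"]
pick = [0.9999999, 1e-9, 1.0, 0.0]

# Treat anything at or above 0.99 as "picked", as in the formatting rule.
chosen = [name for name, x in zip(names, pick) if x >= 0.99]
print(chosen)
```

Testing for equality with 1 would have missed Player A here.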

So you can see we picked Aldridge, Curry, Wall, Klay Thompson,

Pekovic, Porchemo, Tristan Thompson, and Jordan Crawford.

And that was a [INAUDIBLE] to get about 280 points.

Okay.

Now that would get you one lineup.

Now let's suppose you want another lineup.

Well, let me just Move or Copy Sheet.

Make a copy.

You want another lineup

that would be almost as good, because you want to sort of hedge your

bets and not put everything on one player.

Well, all you gotta do is

add a constraint that your target cell is less than or equal to 281.

That'll rule out the solution you have, and you'll get sort of the second best

lineup, and that should give you another pretty good lineup.
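Here is a small Python sketch of that trick: re-solve with the target capped just below the previous best. It uses brute force on a hypothetical five-player pool (pick 3, with a made-up salary cap), so the names, numbers, and the 0.1 step are all assumptions for illustration.

```python
from itertools import combinations

# Hypothetical pool: name -> (salary, projected points).
POOL = {
    "A": (9000, 45.0),
    "B": (8000, 41.0),
    "C": (7000, 36.5),
    "D": (6000, 30.0),
    "E": (5000, 26.0),
}
CAP = 22_000   # made-up salary cap for the toy problem
SIZE = 3       # made-up lineup size
STEP = 0.1     # how far below the previous best to cap the target

def best_lineup(points_cap=None):
    """Best lineup under the salary cap, optionally with total points <= points_cap."""
    best = None
    for names in combinations(POOL, SIZE):
        salary = sum(POOL[n][0] for n in names)
        pts = sum(POOL[n][1] for n in names)
        if salary > CAP:
            continue
        if points_cap is not None and pts > points_cap:
            continue  # the added constraint rules out earlier, better lineups
        if best is None or pts > best[0]:
            best = (pts, names)
    return best

first = best_lineup()
second = best_lineup(points_cap=first[0] - STEP)
third = best_lineup(points_cap=second[0] - STEP)
print(first, second, third)
```

One caveat of this approach: if two different lineups tie on points, capping the target skips both of them at once.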

But this all hinges on your predictions, which is the hard part.

Okay, so there I got, the second best lineup I could get was 280.9.

Okay, and that's going to involve different players,

I mean it probably swapped out somebody.

It put in Wesley Matthews, okay, instead of Jordan Crawford, so it's

a slightly different lineup.

It put in Wesley Matthews and took somebody out, I'm not sure who.

Okay.

But basically, I could add another lineup and say, hey, less than or equal to 280.

Or, I mean, I could say less than or equal to 280.8.

And so I can generate a bunch of good lineups based on my projections this way.

So I found one with 280.6.

Okay, it put in Russell Westbrook.

So there are a lot of lineups that sort of give the same expected points,

if this is the right expected points.

Okay, now what's the problem with projecting expected points?

Let's suppose you're looking at James Harden, and

you might say look at his last 10 games.

So last 10 games, he averaged 40 FanDuel points.

Okay, the problem is you should adjust this for,

suppose it was a ten game road trip, okay.

Well he's not going to play as well on the road as at home so maybe that was like,

really 41 FanDuel points.

And maybe he played really tough defenses like the Utah Jazz and the Spurs.

So maybe it was like 44 FanDuel points if you adjust for that.

So maybe adjust it, and this I think is the key to being good at FanDuel,

we don't have time to talk about how these adjustments would be made.

So his adjusted FanDuel points might be 44,

and then tonight he's playing the 76ers, who have a lousy defense.

So the last 10 games adjusted,

he averaged 44 FanDuel points, not the 40 that you'd look at his last 10 games.

And since he was playing at a 44-point FanDuel level, and tonight he's at home

against the 76ers, and maybe their defense is 10% worse, you'd bump him.

You'd bump him up to 48 FanDuel points, and this can make a huge difference.
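The Harden arithmetic can be written out as a tiny Python sketch. The lecture does not say how the adjustments are actually made, so the additive road and opponent bumps and the single multiplicative matchup factor below are assumptions chosen only to reproduce the 40, 44, and 48 figures; the home effect is folded into the rounding rather than modeled separately.

```python
def adjusted_average(raw_avg, road_bump, defense_bump):
    """Recent average corrected for where and against whom it was earned (assumed additive)."""
    return raw_avg + road_bump + defense_bump

def tonight_projection(adjusted_avg, matchup_factor):
    """Scale the adjusted level by tonight's matchup (assumed multiplicative)."""
    return adjusted_avg * matchup_factor

# 40 raw, about +1 for a ten-game road trip, about +3 for tough defenses.
adjusted = adjusted_average(40.0, road_bump=1.0, defense_bump=3.0)

# Opposing defense assumed 10% worse than average.
tonight = tonight_projection(adjusted, matchup_factor=1.10)
print(adjusted, round(tonight))
```

The gap between the raw 40 and the projected 48 is large enough to change which players the optimizer picks.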

Okay, now what makes it really hard in baseball,

if you want to do these adjustments, is the park.

We know a lot fewer runs are scored at the San Diego

park than at the Colorado Rockies park, so you'd have to adjust for that.

And then you have to adjust the hitter: if he's a left-handed hitter,

how will he do against a left-handed pitcher versus a right-handed pitcher?

And then you don't even know if they're going to play the guy,

because they might platoon and sit him out against the left-handed starting pitcher.

So coming up with predicted FanDuel points is really, really hard.

But if you could come up with good predicted FanDuel points,

this is the way you should pick your lineup for a double-up or a triple-up.

Now, for a tournament you'd have to factor in variability.

Because to win a tournament you need a lineup that has sort of high variability.

Maximizing the expected points won't win you a tournament.

You need to basically maximize the probability of getting, let's say at least

a certain number of points and that's beyond the scope of where you want to go.
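A rough Python illustration of why variability matters in a tournament: assume, purely for the sketch, that a lineup's total is normally distributed. The means, standard deviations, and the 330-point target are made-up numbers, and real lineup totals need not be normal.

```python
from statistics import NormalDist

def prob_at_least(target, mean, sd):
    """P(total >= target) under an assumed normal distribution for the lineup total."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(target)

# Hypothetical lineups: one with a higher mean, one with a lower mean but more spread.
safe = prob_at_least(330, mean=281, sd=20)
swingy = prob_at_least(330, mean=275, sd=35)
print(round(safe, 4), round(swingy, 4))
```

Under these assumptions, the lineup with fewer expected points is still the more likely one to reach a very high target score.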

But hopefully, that'll get you sort of interested in this exciting world of

fantasy sports, and link it with math.
