In this lecture, we'll learn about Minimax search.

What is Minimax search?

It's a technique to pick the best move in an adversarial two-player game.

Why do we care?

Because this is a big help when we're

developing artificial intelligence in a turn-based game.

Let's look at an example.

This example comes from data structures for game programmers by Ron Penton.

He uses rocks, but of course,

I'm using teddy bears.

So the idea behind this game is that you never want to take the last teddy bear.

On your turn, you are allowed to take any number of

teddy bears from a single bin and you have to take at least one teddy bear.

So this tree is rooted at the configuration shown

at the top of the two teddy bears and the left bin and one teddy bear on the right,

and its Player ones turn.

And the next level of the tree,

and this is often called a ply,

Laken plywood, there are different layers of wood,

this is a ply in the tree.

So, one step down is

all the different configurations that could occur from Player one taking their turn.

So on the left side,

we have a Player one taking one teddy bear from the left bin.

In the middle, we have Player one taking two teddy bears from the left bin.

And on the right, we have Player one taking one teddy bear from the right bin.

So, to generate our Minimax search tree,

we're basically going through

the different game configurations that occur as people take turns.

Now it's Player two's turn.

And on the left sub-tree,

you can see Player two can either take

the teddy bear from the left Bin or the teddy bear from the right bin.

In the middle, the only thing Player two can do is

take the single teddy bear from the right bin,

and that leads to Player two losing because Player

two lost by taking the last teddy bear.

On the right hand side,

Player two has two choices.

On the right hand side,

Player two can be particularly stupid and take both teddy bears out of the left bin,

thereby losing the game,

or they can take one teddy bear out of the left bin.

And then on the next ply down,

it's Player one's turn,

and either they've won in that middle sub tree or all the way in the right,

or they have a configuration that only contains one teddy bear,

so they're going to have to take that teddy bear which leads to them losing the game.

So this is the game tree for a complete play of don't take the last Teddy.

That's the new name and that's the programming assignment for this module.

You'll, of course, extend what we're doing here.

But once we have the game tree,

we need to help Player one figure out what they should do for their turn.

Now, I've marked the game tree with the Minimax scores.

So, at the very top at the root of the tree,

Player one is the maximizing player.

They want to get the highest score possible,

so the way we're applying the Minimax scores down on

the leaf nodes of the tree is if player one lost, they get zero.

And if player two lost, they get one.

So, Player one wants to make a move from

the route that gives them the best chance of scoring one.

Player two on the next ply down is a minimizing player.

They wanted to try to minimize the score that

Player one can achieve so they'll make the worst move for Player one.

And then the next play down,

Player one takes their turn and they're trying to maximize their benefit, and so on.

So that's why this is called a Minimax search is because we're alternating

between minimizing and maximizing even though we start with a maximizing move.

Even though I've marked the game tree all the way up to the root,

we need to understand how that works.

So assume that I have now marked all these leaf nodes with

the appropriate scores and we'll see in

the next lecture that we implement this with recursion,

so now we back out of the recursion on

the maximizing apply just above those bottom leaf nodes.

So here, as we back out of the recursion from Player one lost,

Player one picks the maximum score in all its children.

It only has one child,

so it just has to pick zero.

Zero happens to be both maximum and minimum,

but it picks the best number.

It also does this on the next sub-tree.

And again, no choices.

This one, we're actually lucky Player one is going to score one at this ply of the tree.

Here, Player one picks the maximum of all its one child.

So it has to be zero,

and here's another place where Player one wins.

So we're backing up out of the recursion from doing our Minimax search,

and we just did the max level,

so now we need to do the min level.

So, the minimizing player is going to pick the lowest score of all its children.

Now on this left sub-tree,

they're both zero, so it picks zero.

On this middle sub-tree,

It only has one child,

so it has to accept that score.

And finally, we see an actual decision,

so the children have scores of zero and one.

And because we are on the minimizing ply,

we pick the lowest score between zero and one,

so we pick zero.

Because remember, Player two is trying to do whatever is worst for Player one.

But now, Player two is done,

this level of the recursion is done,

and so we backup to the root,

and Player one will decide between these three children.

One child has a score of zero,

one child has a score of one,

and one child has a score of zero.

So Player one, because that player is a maximizing player,

picks the highest score of the children.

And that's a one, and that makes it clear that player

one's next move should be to take two teddy bears out of the left hand bin.

Don't lose sight in the mechanics of figuring out how all this works.

Our end goal is to tell the root of the tree,

what should you do?

And of course, that means that this is great for AI.

We're trying to help AI make decisions,

but that's how the Minimax process works In the small example.

To recap, in this lecture,

we learned about Minimax search.

In practice, we usually limit the depth of search that we perform,

because in most cases,

the fan out, the branches out of a particular game state are

so high that we can't go all the way to the end of the game.

So what we do is we limit the depth of

our search and we use something called a heuristic evaluation function

to assign a score based on whatever criteria we come

up with for a particular game configuration.

Common more advanced techniques like

alpha beta pruning and iterative deepening are built on top of Minimax search.

So alpha beta pruning is a technique that uses many Minimax search,

but when we start exploring a branch that has

no possibility of being better than our current best solution,

we just prune that branch and we don't explore it any more.

Iterative deepening is a technique where we

perform Minimax search to one level and saved that result,

then perform Minimax search to two levels and save that result, and so on.

And this is a really useful technique when we have

time constraints on how long we can execute the search.

So in our peak game studios,

game the cat, we actually used iterative deepening.

Because based on the selected difficulty,

the AI could spend a certain amount of time looking for their next move.

And if we used iterative deepening, which we did,

we could actually always have some solution when time ran out,

but we could keep looking for a better and better solution until time ran how.