1:27

So if you had a path like this with three edges and length one, two and

three, then the length of the path would just be six.

And then we define the shortest SV path in the natural way, so amongst all of

the paths directed from S to V, each one has its own respective path length and

then the minimum overall SV paths is the shortest path distance in the graph G.

So I'm going to make two assumptions for these lectures.

One is just really for convenience, the other is really important.

The other assumption, without which,

Dijkstra's algorithm is not correct, as we'll see.

2:01

So for convenience we'll assume that there is a directed path from

S to every other vertex V in the graph,

otherwise the shortest path distance is something we define to be plus infinity.

And the reason this is not a big assumption is,

if you think about it, you could detect which vertices are not reachable from

S just in a preprocessing step using, say, breadth-first or depth-first search.

And then you could delete the irrelevant part of the graph, and

run Dijkstra's algorithm as we'll describe it on what remains.

Alternatively, Dijkstra's algorithm will quite naturally figure out

what vertices there are paths to from S and which ones there are not, so

this won't really come up.

So to keep it simple, just think about we have an input graph

where you can get from S to V, for every different vertex V.

And the challenge then is amongst all the ways to get from S to V,

what is the shortest way to do it?

3:05

Now in the context of a driving directions application

it's natural to ask the question why would you ever care about negative edge lengths.

Until we invent a time machine that doesn't seem like

negative edge lengths are going to be relevant when you are computing literal

paths through literal networks.

But again remember that paths can be thought of as more abstractly as a just

sequence of decisions.

And some of the most powerful applications of shortest paths

are coming up with optimal weight such sequences.

So, for example, maybe you're engaging in financial transactions and

you have the option of both buying and selling assets at different times.

If you sell then you get some kind of profit and

that would correspond to a negative edge length.

So there are quite interesting applications in which negative edge

lengths are relevant.

If you are dealing with such an application,

Dijkstra's algorithm is not the algorithm to use.

There's a different shortest path algorithm, a couple other ones.

But the most well-known one is called Bellman-Ford.

That's something based on dynamic programming,

which we may well cover in a SQL course.

Okay, so for Dijkstra's algorithm, we always focus on graphs.

That'll have only non-negative edge lengths.

So, with the next quiz, I just want to make sure that you understand the single

source shortest path problem.

Let me draw for you here a simple four node network, and ask you for,

what are the four shortest path lengths.

So from the source vertex s, to each of the four vertices in the network.

All right, so the answer to this quiz is the final option, 0,1,3,6.

To see why that's true, well,

all of the options had 0 as the shortest-path distance from s to itself.

So that just seemed kind of obvious.

So the empty path will get you from s to itself and have 0 length.

No suppose you wanted to get from S to V, well there's actually only one way to do

that, you have to go along this one hop path.

So the only path has length of one, so the shortest path distance from S to V is one.

Now W's more interesting, there's a direct one hop path, SW,

that has a length of four, but that is not the shortest path from S to

W Inf act to two-hop path that goes through v as an intermediary has

total path length three which is less than the length of the direct arc from s to w.

So therefore the shortest distance from s to w is going to be 3.

And finally for the vertex t there's three different paths going from s to t.

There's the two-hop path that goes through v.

There's the two hop path which goes through w, both of those have path length

7, and then there's the three hop path which goes through both v and w.

And that actually has a path length of one plus two plus three equals six.

So despite having a largest number of edges, the zigzag path is, in fact,

the shortest path from s to t and it has length 6.

All right, so before I tell you how Dijstrka's algorthin works,

I feel like I should justify the existence of this video a little bit.

All right?

Because this is not the first time we've seen shortest paths.

You might be thinking rightfully so.

We already know how to compute shortest paths.

That was one of the applications of breadth first search.

6:30

So for example, in the graph that we've explored in the previous quiz,

if we ran breadth first search, starting from the vertex s,

it would say that the shortest path distance from s to t is 2 and

that's because there's a path with two hops going from s to t.

Put differently, t is in the second layer emanating from s.

But as we saw in the quiz, there's not in fact

a shortest two hop path from s to t if you care about the edge lengths.

Rather the minimum length path, the shortest path, with respect to the edge

weights, is this three hop path which gives us a total length of 6.

So breadth first search is not going to give us what we want

when the edge lengths are not all the same.

And if you think about an application like driving directions, then needless to say,

it's not the case that every edge in the network is the same.

Some roads are much longer than others,

some roads will have much larger travel times than others, so

we really do need to solve this more general shortest path problem.

Similarly, if you're thinking more abstractly about a sequence of decisions

like financial transactions,

in general different transactions will have different values.

So you really want to solve general shortest paths,

you're not in the special case that breadth-first search solves.

7:48

General edge weights, unit edge weights, it's basically the same.

Say you have an edge that has length three.

How is that fundamentally different than having a path with three edges,

each of which has length one?

So why not just replace all the edges with a path of edges of the appropriate length?

Now we have a network in which every edge has unit length and

now we can just run breadth-first search.

So put succinctly, isn't it the case that computing shortest paths with general

edge weights reduces to computing shortest paths with unit edge weights?

Well, the first comment I want to make is I think this would be an excellent

objection to raise.

And indeed, as programmers, as computer scientists,

this is the way you should be thinking.

If you see a problem that seems superficially harder than another one,

you always want to ask, well, maybe just with a clever trick I can reduce it

to a problem I already know how to solve.

That's a great attitude in general for problem solving.

And indeed, if all of the edge lengths were just small numbers, like 1, 2, and

3 and so on, this trick would work fine.

8:49

The issue is when you have a network where the different edges can have very

different lengths.

And that's certainly the case in many applications.

Definitely road networks would be one where you have both sort of long

highways and you have neighborhood streets.

And potentially in financial transaction based networks you would also have a wide

variance between the value of different transactions.

And the problem then is some of these edge lengths might be really big.

They might be 100, they might be 1,000.

It's very hard to put operating bounds on how large these edge weights could be.

So if you start wantonly replacing single edges with these really long paths

of like 1,000, you've blown up the size of your graph way too much.

So you do have a faithful representation of your old network, but

it's too wasteful.

So even though breadth-first search runs in linear time,

it's now on this much larger graph.

And we'd much prefer something which is linear time or

almost linear time that works directly on the original graph.

And that is exactly what Dijkstra's shortest-path algorithm

is going to accomplish.

9:49

So this is another one of those algorithms where no matter how many times I explain

it, it's always just super fun to teach.

And the main reason is because it exposes the beauty that pops up in good algorithm

design.

So the pseudocode, as you'll see in a second, is itself very elegant.

We're just going to have one loop, and in each iteration of the loop we will compute

the shortest path distance to one additional vertex.

And by the end of the loop we'll compute shortest path distances to everybody.

The proof of correctness, which we'll do in the next video, is a little bit subtle,

but also quite natural, quite pretty.

And then finally, Dijkstra's algorithm will give us our first opportunity to see

the interplay between good algorithm design and good data structure design.

So with a suitable application of the heap data structure,

we'll be able to implement Dijkstra's algorithm so it runs blazingly fast,

almost linear time, namely m times log n.

But I'm getting little ahead of myself.

Let me actually show you this pseudocode.

10:43

At a high level, you really should think of Dijkstra's algorithm as being a close

cousin of breadth-first search.

And indeed, if all of the edge lengths are equal to one,

Dijkstra's algorithm becomes breadth-first search.

So this is sort of a slick generalization of breadth-first search

when edges can have different lengths.

So like our generic graph search procedures, we're going to start at

the source vertex s, and in each iteration we're going to conquer one new vertex.

And we'll do that once each iteration after m minus1 iteration, we'll be done.

And in each iteration will correctly compute the shortest path distance

to one new possible destination vertex v.

11:58

Now to help you understand Dijkstra's algorithm,

I'm going to do some additional bookkeeping which you would not do

in a real implementation of Dijkstra's algorithm.

Specifically, in addition to this array capital A in which we

compute shortest path distances from the source vertex to every other destination,

there's going to be an array capital B in which we'll keep track

of the actual shortest path itself from the source vertex s to each destination v.

So the arrays A and B will be indexed in the same way.

There'll be one entry for each possible destination vertex v.

Capital A will store just a number for each destination, shortest path distance.

The array B in each position will store an actual path,

so the shortest path from s to V.

But again, you would not include this in an actual implementation.

I just find in my experience it's easier for students to understand this algorithm

if we think of the paths being carried along as well.

12:49

So now that I've told you the semantics of these two arrays,

I hope it's no surprise how we initialize them for the source vertex itself, s.

The shortest path distance from s to itself is 0.

The empty path gets you from s to s with length 0.

There's no negative edges by assumption, so

there's no way you can get from s back to s with a non-positive length, so

this is definitely the shortest path distance for s.

By the same reasoning,

the shortest path from s to s is just the empty path, the path with no edges in it.

14:09

So that motivates that a given iteration of the while loop

to look at the stuff we've already process, that's X, and

the stuff we haven't already processed, that's v minus X.

s, of course, starts in X and we never take anything out of X, so

s is still there.

In some generic iteration of the while loop,

we might have some other vertices that are in X.

And in a generic iteration of this while loop,

there might be multiple vertices which are not in X.

And now, as we've seen in our graph search procedures, there are general or

edges crossing this cut.

So there are edges which have one endpoint in each side, one endpoint in X and

one endpoint outside of X.

This is a directed graph so they can cross in two directions.

They can cross from left to right or they can cross from right to left.

14:51

So you might have some edges internal to X.

Those are things we don't care about at this point.

You might have edges which are internal to v- X.

We also don't care about those, at least not quite yet.

And then you got edges which can cross from X to v-X,

as well as edges that can cross in the reverse direction, from v -X back to X.

And the ones we're going to be interested in, just like when we did graph search and

directed graphs, are the edges crossing from left to right,

the edges whose tail is amongst the vertices we've already seen and

whose head is some not yet explored vertex.

15:28

So the first idea is that in each iteration of the while loop we scan, or

we examine, all of the edges with tail in X and head outside of X.

One of those is going to lead us to the vertex that we pick next.

So that's the first idea, but now we need a second idea,

because this is again quite underdetermined.

There could be multiple such vertices which meet this criterion.

So for example, in the cartoon in the bottom left part of this slide,

you'll notice that there's one vertex here.

Which is the head of an arc that crosses from left to right.

And there's yet another vertex down here in v minus x,

which again is the head of an arc which crosses from left to right.

There are two options of which of those two to suck into our set x and

we might want some guidance about which one to pick next.

The key idea in Dijkstra is to give each vertex a score

corresponding to how close that vertex seems to the source vertex s, and

then to pick among all candidate vertices, the one that has the minimum score.

Let me be more precise.

17:15

For example, in the cartoon in the bottom left,

maybe of the two edges crossing from left to right, maybe the top one

is the one that has a smaller value of Dijkstra's greedy criterion.

In that case, this would be the vertex v star and

the other end of the edge would be the vertex w star.

This edge, v star, w star is going to do wonders for us.

It will both guide us to the vertex that we should add the x next,

that's going to be w star.

It's going to tell us how we should compute the shortest path distance to w

star, as well as what the actual shortest path from s to w star is.

Specifically, in this iteration of the wild loop, after we've chosen this edge v

star w star, we add w star to x.

18:22

What we're going to do is we're going to set the r estimate

of w star's shortest path distance from s to be equal to

the value of this Dijkstra's greedy criterion for this edge.

That is, whatever our previously computed shortest path distance from s to v star

was plus the length of the direct edge from v star to w star.

Now a key point is to realize that this code does make sense.

By which I mean,

if you think about this quantity A(v), this is been previously computed.

And that's because environ of this algorithm is we've always computed

shortest path distances to everything that is in capital X.

And of course, the same thing holds when we need to assign w star shortest path

distance because v star was a member of capital X,

we had already computed its shortest path distance.

So we can just look up the v star entry position in the array a.

Over in our picture on our left, we would just say,

what did we compute the shortest path distance to v star previously?

Maybe it's something like 17.

And then we'd say, what is the length of this direct edge from v star to w star?

Maybe that's 6.

Then we would just add 17 and 6 and

we would put 23 as our estimate of the shortest path distance from s to w star.

19:36

We do something analogous with the shortest path itself in the array b.

That is, again, we're responsible, since we just added w star to capital X,

we're responsible for suggesting a path from s to w star in the b array.

What we're going to do is we're just going to inherit the previously computed path

to v star and we're just going to tack on the end one extra hop,

namely the direct edge from v star to w star.

That will give us a path from s all the way to w star via v star

as an intermediate pit stop and that is the entirety of Dijkstra's Algorithm.

I've explained all of the ingredients about how it works at a conceptual level.

The two things I argue is, why is it correct?

Why does it actually compute shortest paths directly to all of the different

vertices, and then secondly, how fast can we implement it?

The next two videos are going to answer both of those questions but

before we do that, let's go through an example to get a better feel for

how this algorithm actually works.

I also want to go through a non example so

that you can appreciate how it breaks down when there are negative edges, and

that'll make it clear why do we need a proof of correctness because it's not

correct without any assumptions about the edge lengths.