The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

Loading...

En provenance du cours de Stanford University

Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming

485 notes

The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

À partir de la leçon

Week 1

Two motivating applications; selected review; introduction to greedy algorithms; a scheduling application; Prim's MST algorithm.

- Tim RoughgardenProfessor

Computer Science

Today, we're going to embark on the discussion of a new algorithm design

paradigm. Namely, that of designing and analyzing

greedy algorithms. So to put this study of greedy algorithms

in a little bit of context, let's just zoom out.

Let's both review some of the algorithm design paradigms that we've already seen,

as well as look forward to some that we're going to learn later on, in this

course. So it's sort of a sad fact of life that

in algorithm design, there's no one silver bullet.

There's no magic potion that's the cure for all your computational problems.

So instead, the best we can do, and the focus of these courses, is to discuss

general techniques that apply to lots of different problems that arise and lots of

different domains. So that's what I mean by algorithm design

paradigms. High level problem solving strategies

that cut across multiple applications. So let's look at some examples.

Back in part one, we started with the divide and conquer algorithm design

paradigm. A canonical example of that paradigm

being the merge sort algorithm. So remember in divide and conquer what

you do is you take your problem you break it into smaller sub problems, you solve

the sub problems recursively and then you combine the results into a solution for

the original problem. Like how in merge sort you recursively

sort two sub arrays and then merge the results to get a sorted version of the

original input array. Another paradigm that we touched on in

part one, although we didn't discuss it anywhere

near as thoroughly, is that of randomized algorithms.

So the idea that you could have code flip coins,

that is, make random choices, inside the code itself.

Often, this leads to simpler, more practical, or more elegant algorithms.

A canonical application here is the quick sort algorithm using a random pivot

element. But we also saw applications for example,

to the design of hash functions. So the next measure paradigm we're going

to discuss is that of greedy algorithms. So these are algorithms that iteratively

make myopic decisions. In fact, we've already seen an example of

a greedy algorithm in part one namely Dijkstra's shortest path algorithm.

And then, the final paradigm we're going to discuss in this class is that of

dynamic programming, a very powerful paradigm which solves, in

particular, two of the motivating questions we saw earlier namely, sequence

alignment, and distributed shortest paths.

So what is a greedy algorithm anyways? Well, to be honest, I'm not going to

offer you a formal definition. In fact, much blood and ink has been

spilled over which algorithm is precisely greedy algorithms.

But, I'll give you a sort of informal description.

A sort of rule of thumb for what greedy algorithms usually look like.

Generally speaking, what a greedy algorithm does, is make a sequence of

decisions with each decision being made myopically.

That is, it seems like a good idea at the time and then you hope that everything

works out at the end. The best way to get a feel for greedy

algorithms is to see examples and the upcoming lectures will give you a number

of them. But I want to point out we've actually

already seen an example of a greedy algorithm in part one of this course,

namely Dijkstra's shortest path algorithm.

So in what sense is Dijkstra's algorithm a greedy algorithm?

Well if you recall the psuedo code for Dijkstra's algorithm, you'll recall

there's one main wild loop and the algorithm process's exactly one new

destination vertex in each iteration of this wild loop, so there's exactly N - 1

iterations overall, where N is the number of vertices.

So the algorithm only gets one shot to compute the shortest path to a given

destination. It never goes back and revisits the decision, in that sense the

decisions are myoptic, irrevocable and that's the sense in which Dijkstra's

algorithm is greedy. So let me pause for a moment to discuss

the greedy algorithm design paradigm generally.

Probably this discussion will seem a little abstract so I recommend you

revisit this discussion on the slide after we've seen a few examples so at

that point I think it will really hit home.

So let me proceed by comparing it and contrasting it to the paradigm we've

already studied in depth. That of divide and conquer algorithms.

So you'll recall that in a divide and conquer algorithm what you do is, you

break the problem into sub-problems. So, maybe you take an input array and you

split it into two sub-arrays. Then you solve the smaller sub-problems

recursively, and then you combine the results of the sub-problems into a

solution to the original input. So the greedy paradigm is quite different

in several respects. First, both a strength and a weakness of

the greedy algorithm design paradigm is just how easy it is to apply.

So it's often quite easy to come up with plausible greedy algorithms for a

problem, even multiple difference plausible greedy algorithms.

I think that a point of contrast with divide and conquer algorithms.

Often it's tricky to come up with a plausible divide and conquer algorithm,

and usually you have this eureka moment where you finally figure out how to

decompose the problem in the right way. And once you have the eureka moment,

you're good to go. So secondly, I'm happy to report that

analyzing running time of greedy algorithms will generally be much easier

than it was with divide and conquer algorithms.

For divide and conquer algorithms it was really unclear whether they were fast or

slow, because we had to understand the running time over multiple levels of

recursion. On the one hand problems were size was

getting smaller, but on the other hand, the number of some problems was

proliferating. So we had to work hard, we developed

these powerful tools like the master method, and some other techniques, for

figuring out just how fast an algorithm like Merge Sort runs, or just how fast an

algorithm like Strassen's fast matrix multiplication algorithm runs.

In contrast with greedy algorithms, it will often be a one liner.

Often it will be clear that the work is dominated by say, a sorting sub routine

and of course we all know that sorting takes n log and time if you use a

sensible algorithm for it. Now the catch,

and this is the third point of comparison,

is we're generally going to have to work much harder to understand correctness

issues of greedy algorithms. For divide-and-conquer algorithms we

didn't talk much about correctness. It was generally a pretty straightforward

induction proof. You can review the lectures on Quicksort

if you want an example of one of those canonical inductive correctness proofs.

But the game totally changes with greedy algorithms.

In fact, given a greedy algorithm we often won't even have very good intuition

for whether or not they are correct. Let alone how to prove they're correct.

So even with a correct algorithm, it's often hard to figure out, why it's

correct. And in fact, if you remember only one

thing from all of this greedy algorithm discussion many years from now,

I hope one key thing you remember is they're often not correct.

Often, especially if it's one you proposed yourself which you're very

biased, in favor of. You will think the algorithm, the greedy

algorithm must be correct because it's so natural.

But many of them are not, so keep that in mind.

So to give you some immediate practice with the ubiquitous incorrectness of

natural algorithm. Let's review a point that we already

covered in part one of this class concerning Dijkstra's algorithm.

Now, in part one we made a big deal of what a justly famous algorithm Dijkstra's

shortest path algorithm is, it runs brazenly fast and it computes all the

shortest paths. What else do you want?

Well remember there was an assumption when we proved that the Dijkstra's

algorithm is correct. We assumed that every edge of the given

network has a non negative length. We did not allow negative edge lengths.

And as we discussed in part one, you know,

for many applications, you only care about non negative edge lengths.

But there are applications where you do want negative edge lengths.

So let's review on this quiz why Dijkstra's is actually incorrect, despite

being so natural. it's incorrect when edges can have

negative lengths. So I've drawn in green, a very simple

shortest path network with three edges and I've annotated the edges with their

links. You'll notice one of those edges does

have a negative length, the edge from V to W with length minus two.

So the question is consider the source vertex S and the destination vertex W.

And the question is, what is the shortest path distance computed by Dijkstra's

algorithm and you may have to go and review just a pseudo code in part one or

on the web. to answer that part of the question and

then what is in fact the actual shortest path distance from S to W where as usual

the length of a path is just the sum of the lengths of the edges in the path.

All right, so the correct answer is D. So let's start with the second part of

the question, what is the actual length of a shortest path from S to W when

there's only two paths at all in the graph?

The one straight from S to W that has length 2, and the one that goes by the

intermediate point V that has length 3 + -21, = 1 which is shorter.

So, SVW is the shortest path that has length 1. Why is Dijkstra incorrect? Well

if you go back to the pseudo code of Dijkstra, you'll see that in the very

first iteration it will greedily find the closest vertex to S in that case this is

W, W is closer then V. It will greedily compute the shortest path distance to W

knowing the information it has right now and all it knows is there's this one hot

path from S to W, so it will irrevocably commute to the shortest path distance

from S to W as 2. Never reconsidering that decision later.

So Dijkstra will terminate with the incorrect output that the shortest path

link from S to W is 2. This doesn't contradict anything we

proved in part one, because we established correctness of Dijkstra only

under the assumption that all edge links are non-negative, an assumption which is

violated in this particular example. But again, the takeaway point here is

that, you know, it's easy to write down a greedy algorithm, especially if you came

up with it yourself. You probably believe deep in your heart

that it's got to be correct all the time, but more often than not, probably your

greedy heuristic is nothing more than a heuristic.

And there will be instances in which it does the wrong thing.

So keep that in mind in greedy algorithm design.

So now that my conscience is clear, having warned you about the perils of

greedy algorithm design, let's turn to proofs of correctness.

That is if you have a greedy algorithm that is correct.

And we will see some notable examples in the coming lectures.

How would you actually establish that effect?

Or if you have a greedy algorithm, and you don't know whether or not it is

correct, how would you approach trying to

understand which one it is, whether it's correct or not?

So let me level with you. Proving greedy algorithm is correct.

Frankly, is sort of, more art than science.

So, unlike the divide and conquer paradigm, where everything was somewhat

formulaic. We had these black box ways of evaluating

recurrences. We had this sort of, template for proving

algorithms correct. Really, proving correctness of greedy

algorithms takes a lot of creativity. And it has a bit of an ad hoc flavor.

That said, as usual, to the extent that they are recurring themes.

That is what I will spend our time together emphasizing.

So let me tell you just again about very high level.

How you might go about this. You, again, might want to revisit this

context aft-, content after you've seen some examples where I think it'll make a

lot more sense. So method one is our old friend or

perhaps nemesis depending on your disposition, namely proofs by induction.

Now for a greedy algorithms remember what they do, they sequentially make a bunch

of irrevocable decisions, so here the induction is going to be on decisions

made by the algorithm. And if you go back to our proof of

correctness of Dijkstra's algorithm, that in fact is exactly how we proved

Dijkstra's algorithm correct. It was by induction of the number of

iterations, in each iteration of the main wild loop.

Computed the shortest path to one new destination.

And we always proof that assuming all of our previous computations were correct,

that's the inductive hypothesis. Then so is the computation in the current

iteration. And so then by induction, everything the

algorithm ever does is correct. So that's a greedy

proof by induction that a greedy algorithm can be correct.

And we might see some more examples of those in, for other algorithms in the

lectures to come. Some of the text books call this method

of proof greedy stays ahead, meaning you always proof greedy's doing

the right thing iteration by iteration. So a second approach to approving the

correctness of greedy algorithms which works in a lot of cases is what's called

an exchange argument. So you haven't yet seen any examples of

exchange arguments in this class so I can't really tell you what they are but

that's what we're going to proceed next. I'm going to argue by an exchange

argument that a couple of difference famous greedy algorithms are in fact

corrected. It has a couple of different flavors one

flavor is to approach it by contradiction.

You assume for contradiction that a greedy algorithm is incorrect and then

you show that you can take an optimal solution and exchange two elements of

that optimal solution and get something even better which of course contradicts

the assumption that you started with an optimal solution.

a different flavor would be to gradually exchange an optimal solution into the one

output by a greedy algorithm without making the solution any worse.

That would show that the output of the greedy algorithm is in fact optimal.

And formally that's done by an induction on the number of exchanges required to

transfer an optimum solution into yours. And finally, I've already said it once,

but let me say it again, there's not a whole lot of formula behind proving

greedy algorithms correct, you often have to be quite creative, you might have to

stitch together aspects of method one and method two, you might have to do

something completely different. Really, any rigorous proof is fair game.

Coursera propose un accès universel à la meilleure formation au monde,
en partenariat avec des universités et des organisations du plus haut niveau, pour proposer des cours en ligne.