The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

Loading...

From the course by Stanford University

Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming

247 ratings

Stanford University

247 ratings

Course 3 of 4 in the Specialization Algorithms

The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

From the lesson

Week 3

Huffman codes; introduction to dynamic programming.

- Tim RoughgardenProfessor

Computer Science

So the time has arrived to begin our study of dynamic programming. So this is a general algorithm design paradigm. As I mentioned at the beginning of the course, it has a number of justly famous applications. However, I'm not going to tell you just yet what it is that makes an algorithm dynamic programming. Rather, our plan is, over the next few videos, to develop from scratch an algorithm for a non trivial, concrete computational problem. The problem is finding the maximum weight independent set and a path graph. This concrete problem is going to force us to develop a number of new ideas. And, once we've finished solving the problem, at that point, we'll zoom out, and I'll point out what are the characteris-, characteristics of our solution which make it a dynamic programming algorithm. Then, armed with both a sort of formula for developing the dynamic programming algorithms, as well as a concrete instantiation, we'll move onto a number of further, and in general harder applications of the paradigm. Indeed, even more than usual, the dynamic programming paradigm takes practice to perfect. In my experience, students find it counterintuitive at first, and they often struggle to apply the paradigm to problems that they haven's seen before. But here's the good news, dynamic programming is relatively formulaic, certainly more so than our recent study of greedy algorithms, and it's something that you can get a hang of. So with sufficient practice, and that's exactly what I'll be giving you in the next couple of weeks, you should find yourself with a powerful and quite widely applicable new tool in your programmer tool box. So let me introduce you to the concave, concrete problem we're going to study over the next few videos. It's a graph problem but a very simple graph problem. In fact, we're going to restrict our attention merely to path graphs. That's graphs that consist solely of a path on some number n of vertices. The only other part of the input is a single non-negative number per vertex. We're going to call these weights. For example here's a path graph on four vertices, and let's give the vertices the weights one, four, five, and four. The responsibility of the algorithm is going to be to output an independent set. What that means is a subset of the graph's vertices, so that no two vertices are adjacent. So in the context of a simple path graph, it just means you gotta return some of the vertices and always avoiding consecutive pairs of vertices. So when you have a path of four vertices, examples of independent sets include the empty set, any single vertex, vertices one and three, vertices two and four, and vertices one and four. You could not, for example, return vertices two and three. Because those were adjacent. That is forbidden. Now, to make this interesting, we're going to want just, not any old independent set, but the one whose sum of vertex weights is as large as possible. That's the max weight independence set problem. So what I'm going to do next is use this concrete problem as an opportunity to review the various algorithm design paradigms that we've seen so far. Along the way we'll see that none of them actually work very well for this problem. And that's going to motivate us to devise a new approach, and that approach is going to turn out to be dynamic programming. . So there's always our standard punching bag, brute force search. This would entail iterating through all of the independent sets and remembering the one with maximum total weights. Of course it's correct, no question about that, but as usual this would require exponential time. Even in just a path graph, the number of independent sets is exponential in the number of vertices, n. .

So what other algorithm design paradigms do we know? Well we just finished a big segment on greedy algorithms, we could certainly think about that. You know, pretty much most problems it's easy to propose greedy algorithms, and this one's no exception . I think the most natural greedy algorithm you might try to use to compute a maximally independent set would be well. What's the myopic decision? Well, you want to get as much weight overall. So, in each step, you want to just pick the vertex with the highest weight that you haven't already chosen. Now, of course you have to worry about feasibility. Remember, we're not allowed to output adjacent or consecutive vertices. So, if any vertex is ruled out by adjacency, we ignore it. And amongst those that preserve feasibility, we include the highest weight one in our set so far. Well, let me redraw the four node path graph we had in the last slide and let me ask you. What would this greedy algorithm compute on the four node path, and how does that compare to the optimal solution, the independent set with the maximum total weight?

So the correct answer is the second one. Let's see why. So let's start with the optimal solution, the maximum independence set. Remember independence sets are forbidden from choosing adjacent or consecutive vertices. so in this case the only sensible solutions to consider are the first and third vertex, the second and fourth or the first and fourth. Of these, the best is the second and fourth for a total of eight. So what about the greedy algorithm? Well, you know, we just had this period of time where we got really spoiled with the success of greedy algorithms. especially the minimum span entry problem. But let me remind you, you know, greedy algorithms, you know, they're often good heuristics. They're often not guaranteed to be correct. And so I'm happy to have this opportunity to quickly remind you again, of that drawback of greedy algorithms. They're quite frequently not correct. So this is another such case, so what will the greedy algorithm do? Well, it begins by picking the max weight vertex over all. So that would be this vertex with weight five. That unfortunately blocks the algorithm from picking either of the two vertices that has weight four. The only remaining option that preserves feasibility is to pick. The vertex of weight one. So that gives us an indepenedent set of weight six. So this greedy algorithm is not correct. You could of course try to devise other types of algorithms, but I don't know of any greedy approach that will actually solve this problem optimally. So that's a bummer, but we still haven't exhausted our algorithm design paradigms. remember, you know, we learned this quite powerful divide and conquer approach early on in part one of this class. And you know, it seems like it could work here. We had all these successful applications where the input was an array. We broke it into two halves. We were cursed on both sides and combined the results. And here, you know, path graphs don't look so different than an array of numbers. So the obvious approach for divide and conquer is to break the path into two paths, each of half the length of the original, recursively compute a maximum weighted independent set of each, and then somehow combine the results. With the issues with the divide and conquer approach are already apparent, just in our simple four vertex example. So if we recurse on the left half, that is the first two vertices, and we compute a max weight independence set, that's just going to be the second vertex by itself. And if we independently recurse on the right hand side, on the vertices three and four, the maximum weight independence set on right half is going to be the vertex of weight five. And now when we, the recursion's complete and we get our sub-solutions back, we have the second vertex and we have third vertex, but the problem is the union of those two solutions conflicts. Right? We cannot simultaneously output the second and third vertices. Those are consecutive, those are adjacent, and that's not allowed. Moreover, you know, in a four note graph. It's sort of easy to see how to repair this conflict. But in a big graph with say thousands of nodes, if you have a conflict right where the two sub-problems meet, it is not at all obvious how you would quickly fix that and get a feasible and optimal solution to the original problem. Now in some sense the divide and conquer paradigm is more powerful or better suited for this problem than the greedy approach, in that I do know of divide and conquer algorithms that could solve this problem optimally that run in quadratic time. But doing better than that in a divide and conquer matter seems quite challenging. And the dynamic programming based algorithm we'll develop will solve the problem in linear time. That's coming up next.

Coursera provides universal access to the world’s best education, partnering with top universities and organizations to offer courses online.