A greedy heuristic for the knapsack problem is pretty good, but let's work a little harder and do quite a bit better. We're going to accomplish a quite ambitious goal. The knapsack problem, being NP-complete, is not, we expect, going to admit an exact polynomial-time algorithm. Let's shoot for the next best thing: arbitrarily close approximation. That is, let's accept from the user an error parameter epsilon, and return a solution whose value is guaranteed to be at least 1 minus epsilon times that of an optimal knapsack solution. So a client who wants a guarantee that this heuristic outputs a solution with value at least 99% of optimal just has to set epsilon to be 0.01. If this sounds a little too good to be true, there is a catch: the running time is going to increase as we take epsilon smaller and smaller. But what this algorithm exports to the client is an accuracy/running-time trade-off. That is, it exports, via this parameter epsilon, a knob that the user can turn wherever they want. If you want more accuracy and are willing to wait a little bit longer, you set epsilon to be smaller. If you want something really fast and you're okay with a bit of inaccuracy, you take epsilon to be bigger. And this is as good as it gets for NP-complete problems. Assuming P is not equal to NP, there's no way to solve such a problem exactly in polynomial time; the next best thing you can hope for is arbitrarily close approximation in polynomial time. That is what we will get for the knapsack problem using a dynamic programming based heuristic. Sadly, the knapsack problem is the exception that proves the rule. For most NP-complete problems, it is believed, you cannot get an arbitrarily close approximation. One concrete example is the vertex cover problem. We discussed exact, exponential-time algorithms for vertex cover in a separate video, and we've been talking about polynomial-time heuristics for vertex cover, and there are some good ones.
You can get within a factor two of an optimal vertex cover in polynomial time, but it is believed you cannot get an arbitrarily close approximation of the optimal vertex cover in polynomial time. In fact, if you came up with such an algorithm, that would imply P equals NP. While there are some other problems out there like knapsack that admit arbitrarily close approximation, there are many more out there which are like vertex cover, where, assuming P not equal to NP, it is not possible to get arbitrarily close approximation in polynomial time. But this is an algorithms class; we should stay positive and focus on what you can do. So let's move on to the arbitrarily close approximation for knapsack. The high-level idea is both very cool and also very natural. What we're going to do is take the knapsack instance that we're given, so items that have values and weights, and some knapsack capacity, and massage it a little bit, not too much, just a little bit, so that we put it squarely in one of our computationally tractable special cases. Then we're going to solve this transformed version of the original input. Now, because we're not solving the problem that we were given, we're not going to get the right answer. But as long as we didn't transform it too much, we can hope that the optimal solution to the transformed problem is a pretty good approximation for the original problem that we were given. Well, what's a computationally tractable special case of the knapsack problem? To date, we know exactly one: if we assume that the sizes of the items and the knapsack capacity are integers, we can go back to our dynamic programming solution for the knapsack problem, which runs in time n, the number of items, times capital W, the knapsack capacity.
So if it were the case that the knapsack capacity W wasn't too big, if, for example, it was polynomial in the number of items n, then this dynamic programming algorithm would run in polynomial time. For technical reasons, which I'll say a little bit more about later, this computationally tractable special case is not well suited for deployment in a dynamic programming heuristic. Instead, we're going to want to develop a second, similar but different, computationally tractable special case. The second special case is when the sizes and the knapsack capacity can be arbitrary, but the item values are integers that are not too big. This assumption can also be exploited by a dynamic programming algorithm. As we'll show in a separate video, there is a different dynamic programming algorithm for the knapsack problem whose running time is n squared times the largest item value, n squared times Vmax. So from this second dynamic programming algorithm, we can extract a second computationally tractable special case of the knapsack problem. Namely, if all of the item values are small integers, say polynomial in the number of items n, then this dynamic programming algorithm already gives us a polynomial-time algorithm for instances of that form. Armed with the second computationally tractable special case, we can now return to the high-level idea at the beginning of this slide and execute it. Here's how we do it. Our knapsack algorithm is given some knapsack instance; the item values need not be small integers. Now we massage the instance a little bit. We transform it so that it becomes an instance that we know how to solve in polynomial time. How do we do that? We force the item values to be small integers. How do you take arbitrary numbers and make sure that they're small integers? Well, the first thing to try is to drop the low-order bits, and that's essentially what we're going to do.
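To make this concrete, here is a minimal Python sketch of a value-based dynamic program of this kind, under the standard formulation where the table entry A[x] records the minimum total size needed to achieve total value at least x. The function name and table layout here are my own, not the lecture's; the lecture derives its version in a separate video.

```python
def knapsack_by_value(values, sizes, capacity):
    """Sketch of the second DP: O(n^2 * v_max) time for integer values.

    A[x] = minimum total size of a subset of the items processed so
    far whose total value is at least x (infinity if unachievable)."""
    n = len(values)
    v_max = max(values)
    INF = float("inf")
    A = [0] + [INF] * (n * v_max)  # value 0 costs no capacity
    for v, s in zip(values, sizes):
        # sweep x downward so each item is used at most once
        for x in range(n * v_max, 0, -1):
            A[x] = min(A[x], s + A[max(0, x - v)])
    # best achievable value: largest x whose cheapest subset fits
    return max(x for x in range(n * v_max + 1) if A[x] <= capacity)
```

Note that the table is indexed by value rather than by residual capacity, which is exactly why integer values (not integer sizes) are what this algorithm needs.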
Our algorithm then solves this transformed instance. By construction, the instance has small item values, so we can solve it in polynomial time using our second dynamic programming algorithm. Now, we're probably not going to get an optimal solution to the original instance. After all, we solved the wrong instance; we transformed the values before invoking our dynamic programming algorithm. But the hope, and this is just a hope, this is definitely going to require a proof, the hope is that we didn't lose too much information just by throwing out some low-order bits. So, by virtue of computing a solution which is 100% optimal for the almost correct problem, maybe that solution is also almost optimal for the fully correct problem. So here is the algorithm in full detail. It really only has two steps. First, we do the transformation: we round the item values to small integers. Then we invoke the second dynamic programming algorithm on the transformed instance. Precisely, here is how we round each item value v sub i. To begin, we decrease vi down to the nearest multiple of a parameter m. I'm going to leave this parameter m unspecified at the moment, and therefore this algorithm will be underdetermined. Once we start analyzing the algorithm, we'll see what the parameter m has to be as a function of epsilon. So don't forget that epsilon is the accuracy parameter supplied by our client. If the client sets a given value of epsilon, what it means is that our responsibility is to output a feasible solution with value at least 1 minus epsilon times that of an optimal solution. So the smaller the epsilon, the more accurate our algorithm needs to be. When we round an item value vi down to the nearest multiple of m, we're decreasing it by an amount somewhere between 0 and m. If vi was already a multiple of m, we don't decrease it at all. If vi was just below a multiple of m, then we wind up decreasing it by essentially m.
Therefore, the bigger the m, the more information we're throwing out about our instance, and the less accuracy we should expect. That is, the smaller the value of epsilon, the more accuracy demanded by our client, the smaller we're going to have to take our value of m. Now, once we've made everything a multiple of m, we may as well scale down: we divide everything by m, and we're going to get integers. The integers that we get after rounding down to the nearest multiple of m and then dividing by m are going to be called the V hats. An alternative, succinct way to describe the V hats is: V hat i is, by definition, Vi over m rounded down. And those funny brackets in blue on the right, that's called the floor operator; it takes in a real number and returns the biggest integer that's less than or equal to the input. Thus, what is the transformation? In effect, we take the item values that we were given, we divide them by this parameter m, and then we also round down just to make sure we're dealing with integers. Step 2 of our heuristic is to just invoke the second dynamic programming algorithm to solve exactly the transformed instance. That is, we feed into that second dynamic programming algorithm the instance with n items; the sizes are exactly the same as in the knapsack instance we were given originally; the knapsack capacity, capital W, is exactly the same as we were given originally; but we feed in the values V hat 1 through V hat n. That is, we solve it with respect to the transformed values, not with respect to the original values, the Vi's. Let's make two quick observations before we conclude. First of all, remember that for the second dynamic programming algorithm for knapsack, the running time is n squared times the largest item value, and of course, here we're running the dynamic programming algorithm with the transformed item values, the V hats.
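The rounding step above fits in a couple of lines. In this sketch, the parameter m is deliberately left as a free knob, exactly as in the lecture, and the example values below are made up purely for illustration.

```python
import math

def round_values(values, m):
    """Step 1 of the heuristic: v_hat_i = floor(v_i / m).

    Equivalently: round each v_i down to the nearest multiple of m,
    then divide through by m to get a (hopefully small) integer."""
    return [math.floor(v / m) for v in values]

# For example, with m = 10 the values 47, 30, 12 become 4, 3, 1.
# Step 2 would then run the value-based dynamic program on these
# V hats, together with the original sizes and capacity.
```

Note that only the values are touched; sizes and capacity are passed through untransformed, which is what will guarantee feasibility.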
So therefore, if the V hats are big, then this running time is slow; if the V hats are small, then this running time is going to be reasonable. Now, let's remember the definition of the V hats. They're essentially the original item values divided by this magical parameter m, which we haven't yet specified. But qualitatively, if m is big, the V hats are going to be small and the running time is going to be fast. On the other hand, if m is small, the V hats are going to be big and this algorithm is going to be slow. That's how m controls the running time of this dynamic programming based heuristic. Also, remember we argued how, intuitively, it seemed to be controlling the accuracy. The more you jack up m, the more information you lose when you round the Vs and transform them into the V hats, and so the less accuracy you're going to have. So bigger m means, on the one hand, faster running time, but at the cost of worse accuracy. So this parameter m is really a knob that we can tune to figure out whether we want a faster algorithm or higher accuracy. The non-trivial statement, which we need to prove in the next video, is that there's a sweet spot for m: you can simultaneously get a pretty good running time, polynomial running time, and excellent accuracy. One trivial observation, but one that's still really nice, is that this dynamic programming based heuristic is guaranteed to produce a feasible solution. Whatever set of items we get in step 2, it's guaranteed to fit inside the knapsack. The reason for that is we pass to the dynamic programming algorithm in step 2 fully accurate sizes, the original Wi's, and similarly a fully accurate knapsack capacity, capital W. Therefore, the feasible solutions for the transformed knapsack instance are exactly the same as they were in the original knapsack instance. This explains why we needed to develop a second dynamic programming algorithm for knapsack, rather than trying to use the first one we developed.
Thankfully, there is this second computationally tractable special case, where the items have small integer values, and that turns out to be exactly what you want to make this two-step recipe work. The analysis is coming right up.