[MUSIC] We have an algorithm for set cover, and we have seen that it outputs a collection of sets whose expected cost is at most OPT. Now we need to see whether or not the output is a set cover. Is it correct? Is the output a cover? Maybe, maybe not. It depends on our random choices. So let us look at this property more closely. Let us analyze, in particular, the number of elements covered.

The number of elements covered by the sets that are chosen and placed in the output can be written as the sum, over every element e, of the indicator function, which is 1 if element e is covered and 0 if not. This is a random quantity, so again we are going to analyze it in expectation. What is this sum on average? Compute the expectation of the sum. What do you do when you see an expectation of a sum? You exchange summation and expectation: the sum of the expectations of the indicator functions. What do you do when you have the expectation of an indicator function? Replace it by a probability. So what do you get? By the same reasoning as what we saw earlier, the expected number of elements covered is the sum, over every element e, of the probability that element e is covered by the output collection of sets.

So now, to analyze this, we focus on one particular term. Focus on one particular element e. What is the probability that e is covered by the output collection of sets? This is the more technical part of this chapter, I have to say, so hang on. When is element e covered? It is covered if there is a set in the output that covers element e, that is, a set S in the output such that e belongs to S. The probability that e is covered equals the probability that there exists a set S in the output containing e. And so, what is this? The probability that there exists something is 1 minus the probability that it does not exist, that is, 1 minus the probability that none of the sets containing e are chosen in the output.
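For concreteness, the rounding step being analyzed can be sketched as follows. This is a minimal sketch; the input format and the helper names (`randomized_rounding`, `num_covered`) are assumptions for illustration, not from the lecture.

```python
import random

def randomized_rounding(sets, x):
    """Keep each set S independently with probability x[S], its LP value.

    `sets` maps a set name to the elements it covers; `x` maps a set
    name to its fractional value from the LP relaxation.
    (Input format is illustrative, not from the lecture.)
    """
    return [S for S in sets if random.random() < x[S]]

def num_covered(chosen, sets):
    """Number of elements covered: the sum of the indicator variables."""
    covered = set()
    for S in chosen:
        covered |= sets[S]
    return len(covered)
```

The quantity analyzed in the lecture is the expectation of `num_covered` over the random choices made in `randomized_rounding`.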
So the probability that there exists a set S in the output containing e is 1 minus the probability that, for every set S containing e, S is not chosen, is not part of the output collection of sets. So now we need to analyze this probability. We need to upper bound the probability that, for every set S containing e, S is not part of the output collection of sets. So far, this is elementary reasoning.

Now we are going to use independence. We get to use a property that only holds when you have an independent collection of events. If you have two independent events A and B, then the probability that both of them occur is equal to the product of the probabilities. If they are not independent, the probability of A and B could be less, could be more, depending on whether they are correlated or anti-correlated. But if A and B are independent, then this probability is exactly the product of the individual probabilities.

So, what does the algorithm do? For every set S, the algorithm puts S in the output with probability x sub S, and these choices are independent, independent for every S. They are independent, so the probability that all of these events occur simultaneously, that every S containing e is discarded, is equal to the product of the probabilities. So it is equal to the product, over every set S containing e, of the probability that that set is not put in the output. The key thing here is independence. And so now we have to look at the probability that S is not in the output. That probability, we know, is 1 minus x sub S, by the choice of the algorithm. Now we just put everything together and we get that the probability that element e is covered is 1 minus the product, over every set S containing e, of 1 minus x sub S. This gives us an expression for the probability that e is covered by the output collection of sets. So now we have to go back and see how we incorporate that into the analysis of the number of elements covered by the algorithm. Now let's see.
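The product formula just derived can be computed directly from a fractional solution. Here is a small sketch; the input format and the name `prob_covered` are illustrative, not from the lecture.

```python
import math

def prob_covered(e, sets, x):
    """P(e is covered) = 1 - product over S containing e of (1 - x[S]).

    This uses independence: each set S is kept independently with
    probability x[S], so the events "S is not chosen" multiply.
    `sets` maps set names to their elements; `x` maps set names to LP values.
    """
    p_none = math.prod(1 - x[S] for S, elems in sets.items() if e in elems)
    return 1 - p_none
```

For example, if an element belongs to two sets, each kept with probability 0.5, the probability it is covered is 1 - 0.5 * 0.5 = 0.75.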
We need to do a little bit more work on this product of 1 minus x sub S. The first thing we will use is that x is the exponential of the logarithm of x. That's an identity. And the logarithm of a product is the sum of the logarithms: ln(xy) = ln x + ln y. That's almost the definition of a logarithm. Apply this to this product. This expression is the exponential of the logarithm of this expression; it's a product, so we can replace the logarithm of the product by a sum of logarithms. This gives the exponential of the sum, over every set S containing element e, of the natural logarithm of 1 minus x sub S. This is the beginning of an algebraic calculation.

The next thing we will use is that the logarithm of 1 minus x sub S can be replaced, bounded from above, by minus x sub S. It is well known that ln(1 - x) is at most -x. And so we do this, and what do we get? We get that this exponential is at most e to the minus the sum, over every set S containing the element, of x sub S. Now we look at this sum, over every set S containing e, of x sub S, and what do we recognize? We recognize a constraint of our linear program. We have the constraint that, for every element e, the sum of x sub S over all sets S containing e must be at least 1. Apply this, and we deduce that the exponential of minus that sum is at most the exponential of -1.

So what does this tell us? It tells us that the probability that an element e is covered is at least 1 - 1/e. Therefore, the number of elements covered by the output collection of sets is, in expectation, at least the total number of elements times 1 - 1/e. And what is 1 - 1/e? It's approximately 63%, a little bit more than 63%: 63.2%. And so what have we proved at this point? We have proved that, with randomized rounding, we have obtained a collection of sets whose cost is in expectation at most OPT, and that covers, on average, at least 63% of the elements. So, we do not have a set cover. We do not yet have a set cover, but we are getting closer.
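As a quick sanity check on this bound, one can verify numerically that whenever the fractional values of the sets containing an element sum to at least 1 (the LP constraint), the coverage probability 1 minus the product of (1 - x sub S) is indeed at least 1 - 1/e. This is a sketch with randomly generated fractional values, not from the lecture.

```python
import math
import random

def coverage_prob(xs):
    """1 - prod(1 - x) for the fractional values xs of the sets containing e."""
    return 1 - math.prod(1 - x for x in xs)

# Randomly test: enforce the LP constraint sum(xs) >= 1, then check the bound
# coverage_prob(xs) >= 1 - exp(-sum(xs)) >= 1 - 1/e.
random.seed(1)
worst = 1.0
for _ in range(10_000):
    k = random.randint(1, 6)
    xs = [random.random() for _ in range(k)]
    s = sum(xs)
    if s < 1:                     # rescale so the constraint holds with equality;
        xs = [x / s for x in xs]  # each x / s <= 1 since x <= s
    worst = min(worst, coverage_prob(xs))

assert worst >= 1 - 1 / math.e - 1e-12  # the 63.2% guarantee
```

The worst case is approached when many sets each carry a small fractional value summing to exactly 1, which is where the product comes closest to 1/e.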
It only remains to modify the algorithm a little bit to turn it into a correct set cover algorithm.