In this video, we'll learn about the channel coding theorem, and we'll also discuss the capacity of the real Gaussian channel. Finally, we will extend the capacity result to the complex Gaussian channel.

To start with, let us discuss the capacity of a wireless channel. Marconi demonstrated wireless communication in 1901, and Shannon founded information theory in 1948. Information theory tells us that every channel has some capacity. Shannon gave the channel coding theorem, which states that all rates below the capacity C are achievable. To rewrite it, we can say that for every rate R less than or equal to C, there exists a code with arbitrarily small probability of error. So what it means is that we can always transmit at a rate R below the channel capacity with essentially no errors. In other words, to achieve almost zero probability of error, one must transmit at a rate R less than or equal to C. This is a very important result given by Shannon.

Using this, let us start by calculating the capacity of a real Gaussian channel; we are talking about a wired channel first of all. The wired channel can be represented as follows: we have a real input X, to which we add real noise N, where the noise samples are i.i.d. Gaussian with zero mean and variance σ_n². Remember, this is not complex Gaussian, because we are considering real noise. So the received signal is nothing but the sum of the two, Y = X + N, because it is a wired channel. We can always assume the noise and the input signal are independent, with the mean square value of the transmit signal limited by the power P, that is, E[X²] ≤ P.

In this case, using information theory, we can write the capacity as the maximum of the mutual information I(X; Y) between the input and the output, where the maximization is over the density function f_X(x) of the transmit signal X: C = max over f_X(x) of I(X; Y). Now, using the formula for the mutual information in terms of entropies, we can write the second equality, C = max over f_X(x) of H(Y) − H(Y|X), where f_X(x), as I said, is the probability density function of the input X. The mutual information H(Y) − H(Y|X) can be simplified by substituting Y = X + N into the conditional term: H(Y|X) = H(X + N | X). Given X, we already know X, so once it is known it carries no extra uncertainty, and we are left with H(N|X). But N and X are independent, so knowing X gives no added advantage about N, and we can simply write H(N|X) = H(N). Therefore, the mutual information finally simplifies to H(Y) − H(N).

Now our aim is to calculate these two entropies, where N is the noise and Y is the received signal. I am using a standard result which says that for a real Gaussian random variable, the entropy is given by ½ log2(2πe × variance), where the variance is that of the underlying random variable. Here the underlying random variable is the noise, so H(N) = ½ log2(2πe σ_n²), with σ_n² the noise variance. So we now know the entropy of N, and we only need to calculate the entropy of Y. Before that, what I am going to do is calculate the variance of the received signal.
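To make that standard entropy result concrete, here is a minimal Python sketch (not part of the lecture, just a numerical sanity check with an assumed noise standard deviation of 1.5) comparing the closed-form value ½ log2(2πe σ_n²) with a Monte Carlo estimate from samples of the Gaussian noise:

import numpy as np

# Entropy of a real Gaussian N(0, sigma_n^2), in bits:
# H(N) = 0.5 * log2(2 * pi * e * sigma_n^2)
sigma_n = 1.5  # example noise standard deviation (assumed value)
closed_form = 0.5 * np.log2(2 * np.pi * np.e * sigma_n**2)

# Monte Carlo check: H(N) = E[-log2 f_N(N)] under the Gaussian density f_N
rng = np.random.default_rng(0)
n = rng.normal(0.0, sigma_n, size=1_000_000)
log_pdf = -0.5 * np.log(2 * np.pi * sigma_n**2) - n**2 / (2 * sigma_n**2)
monte_carlo = np.mean(-log_pdf / np.log(2))  # convert nats to bits

print(f"closed form : {closed_form:.4f} bits")
print(f"Monte Carlo : {monte_carlo:.4f} bits")

With a million samples the two numbers should agree closely, which is exactly the result we are using for H(N).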
For that, what I am doing is calculating the mean square value of Y. Substituting Y = X + N, we get E[Y²] = E[(X + N)²]. Opening this is very simple because X and N are both real; the formula we are using is (A + B)² = A² + B² + 2AB. This can be further simplified because we know X and N are independent and the noise has zero mean; I am not discussing whether X has zero mean or not, but I know the noise has zero mean. So the cross term becomes E[X]·E[N] = 0. Therefore the expression simplifies to E[Y²] = P + σ_n², because the power of X is P and the noise power is σ_n². I hope this is clear; this is how we have simplified it.

Therefore, now what we are doing is bounding the entropy of Y as H(Y) ≤ ½ log2(2πe (P + σ_n²)). One more important thing that I missed out earlier is that the equality H = ½ log2(2πe × variance) is achieved only when the underlying random variable is Gaussian. If it is not Gaussian, the entropy is only bounded by this expression. And here we do not yet know the distribution of Y, because Y is the sum of the two signals X + N; N is Gaussian, we know, but we do not know the distribution of the transmit signal X, so I cannot talk about the distribution of Y right now. Remember the statement: among all random variables with a given variance, the Gaussian random variable has the maximum entropy. So this equality holds only when Y, the underlying random variable, is Gaussian. This is an important point.

Now, using the above result, I can bound the mutual information as follows. After bounding H(Y), we get I(X; Y) = H(Y) − H(N) ≤ ½ log2(2πe (P + σ_n²)) − ½ log2(2πe σ_n²), where H(N) we already know. So we have got the mutual information, and therefore, based on this result, we can calculate the capacity. After substituting these values, we get the final expression: the capacity is C = ½ log2(1 + P/σ_n²). And this equality is achieved only when the entropy of Y takes its maximum value. When will the entropy of Y be maximum? In our example, when X is also Gaussian, because from the theory of random variables we know that the sum of two independent Gaussian random variables is also Gaussian. We know N is Gaussian, so if the transmit signal is also Gaussian, then Y is Gaussian and its entropy reaches the maximum, and we achieve the equality; otherwise the equality is not achieved. Remember this point: for Y to be Gaussian, X should be Gaussian.

Now let us extend this result to a Gaussian random vector. Till now we were using a simple scalar value, right? Now we have a Gaussian random vector X of dimension n. In that case, remember, the entropy is given by H(X) = ½ log2((2πe)^n |K_X|), where K_X is now a covariance matrix, because X is not scalar: for a scalar we have a variance, for a vector we have a covariance matrix. And the symbol |K_X| represents the determinant of the covariance matrix of X, multiplied by (2πe)^n inside the logarithm. So here I have just extended the entropy result to the case when we have a vector.
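To see these two steps together, here is a small Python sketch, again with assumed example values P = 4 and σ_n² = 1, that checks E[Y²] = P + σ_n² by simulation and evaluates the resulting capacity ½ log2(1 + P/σ_n²):

import numpy as np

# Real Gaussian channel Y = X + N with E[X^2] <= P and N ~ N(0, sigma_n^2).
# Capacity (bits per real dimension): C = 0.5 * log2(1 + P / sigma_n^2),
# achieved when X itself is Gaussian with variance P.
P, sigma_n2 = 4.0, 1.0  # example power and noise variance (assumed values)
C = 0.5 * np.log2(1 + P / sigma_n2)

# Sanity check of the variance step: with zero-mean Gaussian X independent
# of N, E[Y^2] = E[X^2] + E[N^2] = P + sigma_n^2.
rng = np.random.default_rng(1)
x = rng.normal(0.0, np.sqrt(P), size=1_000_000)
noise = rng.normal(0.0, np.sqrt(sigma_n2), size=1_000_000)
y = x + noise

print(f"E[Y^2] ~ {np.mean(y**2):.3f}  (expected {P + sigma_n2:.3f})")
print(f"C = {C:.3f} bits per real dimension")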
Now let us try to find out the capacity of the AWGN channel. We now know the capacity of the real AWGN channel, which is given by ½ log2(1 + P/σ²), and this is achieved only when our input signal is also Gaussian; remember that. The unit is bits per real dimension. Similarly, if I want to write the capacity of the complex AWGN channel, it will be in bits per complex dimension, because we have two dimensions: the real dimension and the imaginary dimension. So I can have just twice of this, the half goes away, and we are left with log2(1 + P/σ²). This is the capacity per complex dimension, remember.

Now there is one important theorem, known as the Landau-Pollak theorem, which says that with bandwidth W there are W such complex dimensions per second. And because we have W complex dimensions per second, the overall capacity with bandwidth W is given by C = W log2(1 + P/σ²). Now the unit will be bits per second, because the capacity per complex dimension gets multiplied by complex dimensions per second, the complex dimensions cancel, and we are just left with bits per second. So if someone asks what the capacity of an AWGN channel with bandwidth W is, you need to say: the capacity is C = W log2(1 + P/σ²), with unit bits per second.

Now, looking at this capacity formula W log2(1 + P/σ²), let us discuss more about this noise variance. Assume the power spectral density per real dimension is N0/2. Then I can calculate the noise variance as the power spectral density multiplied by the bandwidth, because the power spectral density is flat; I am just multiplying it by the overall bandwidth to get the noise variance, or noise power. So per real dimension we get N0·W/2, and per complex dimension, similarly, we can write N0·W. Therefore, after substituting the noise variance in terms of N0 and W, what we get is the capacity C = W log2(1 + P/(N0·W)), with unit bits per second. So what exactly we are saying is that the capacity is a function of the bandwidth W and also of the noise. Remember that; people forget this fact.

Now, if I want to calculate the channel capacity with unit bandwidth, what will it be? C = log2(1 + P/N0). This is the channel capacity with unit bandwidth, with unit bits per second per Hertz, because I am considering unit bandwidth; that is why we have bits per second per Hertz, and this quantity is also known as the spectral efficiency.

In this video we learnt that to achieve almost zero probability of error, one must transmit with a rate less than or equal to the capacity of the channel, R ≤ C. Then we learned that among all random variables with a given variance, the Gaussian random variable has the maximum entropy; therefore, the maximum capacity is achieved when the transmit signal follows a Gaussian distribution, and in that case the received signal also follows a Gaussian distribution. Finally, we discussed the capacity, which is measured in bits per second per Hertz or in bits per second.
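As a closing illustration of the band-limited formula C = W log2(1 + P/(N0·W)), here is a short Python sketch with assumed example values for P and N0; sweeping the bandwidth shows how the capacity in bits per second and the spectral efficiency C/W in bits per second per Hertz behave:

import numpy as np

# Band-limited AWGN capacity: C = W * log2(1 + P / (N0 * W))   [bits/s]
# Spectral efficiency:        C / W                            [bits/s/Hz]
def awgn_capacity(P, N0, W):
    """P: transmit power in watts, N0: one-sided noise PSD in W/Hz,
    W: bandwidth in Hz. Returns capacity in bits per second."""
    return W * np.log2(1 + P / (N0 * W))

P, N0 = 1e-3, 1e-9            # example power and noise PSD (assumed values)
for W in (1e5, 1e6, 1e7):     # sweep the bandwidth to see the trend
    C = awgn_capacity(P, N0, W)
    print(f"W = {W:9.0f} Hz  ->  C = {C/1e6:6.3f} Mbit/s, "
          f"C/W = {C/W:5.2f} bits/s/Hz")

With these assumed numbers, widening the bandwidth increases the capacity in bits per second but lowers the spectral efficiency, because the noise power N0·W grows with W, which is exactly the dependence on bandwidth and noise highlighted above.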