So, let's speak a little bit about how hard it is to compute a Nash equilibrium in a normal form game. So, let's, let's start with a little history. John von Neumann, one of the founders of modern game theory, when he investigated zero sum game it proved the, the existence of equilibrium there. And he uses what's known as the Brouwer fixed point theorem for that. and that led directly to algorithms for computing fixed points in such in such linear programs. first those Danzig's Algorithm that the real equivalent to what in modern day is called LP duality. And it's an exponential procedure although it practice to use widely of note is the Khachiyan polynomial-time approach to solving linear programs. Although in truth in practice it's, it's not used as widely. It's a practical procedure. And when you go beyond the zero sum games so when John Nash approved the existence of equilibrium for general-sum games, he used the same fixed point theorem of Brouwer. And that also informed a series of algorithms and they're noted there on the slide. And we will be looking at two of them, the Lemke-Howson algorithm and a much more recent algorithm due to Ryan Porter and others. I will note that all of these are exponential in the worse case and I'll get back to that later. So, let's start with the Lemke-Howson algorithm. And let's start with a a formulation of the Nash equilibrium for 2-player games. It looks, it looks as follows. and this is a busy slide but I'll walk you through it and all will become become clear. At heart are two sets of variables, the s's and the r's. The s's will denote the will capture the mixed strategy used by the two players, player 1 and player 2. s1, for example, of s, s2k for example, would be the rate or the probability with which player 2 plays action k in in its mixed strategy. So, the s1's and the s2's are the capture the mixed strategy of the two players, player 1, player 2. r is, are what they call the slack variables. And to understand their roles, let's look at, for example, this equation right there. So, this applies to any action of player 1. For any action of player 1, we look at the value that it the, the value that it give with respect to the strategy of the of the other player. And so, we look at all the actions available to player 2. We look at the pay-off to player 1 given that he is playing a particular action, j, and looking at that action of player of the other two and normalizing it by the probability that player 2 attaches to that strategy a2. And so, if we look at this sum as a whole, this is the expected pay-off for player 1 when playing strategy j given that the player 2 is playing a certain mixed strategy s2. And, it is what it is. And, in general if you look at all the actions that the player player 1 plays they will give different pay-offs. What we want is for player 1 to best respond to that strategy of player 2 because in equilibrium, every player is best responding to the other. And so, let's call U*1 to be the pay-off to player 2 to player 1 in the, in the Nash equilibrium. So, in general the pay-off for player 1 when they play aj will be no greater than the best respond in the list. So, we're going to add this slack variable, as it's called, that will say, is this is how much player 1 is missing relative to their best response when they're playing strategy j. Those are the slack variables. And, so now, that will also give us the sense for this condition here. So, the slack variables are always non-negative. And in a Nash equilibrium, they will be exactly zero, except when you speak about strategies that are actually played with zero probability by the player. So, now we talk about the s's. s's, as we said, speak about the weight and the mix that each player gives to their each of their actions in the each strategy they play. And so, when player 1 plays strategy j with non-zero probability, it better be the case that is better, best responding namely that the slight variable is zero. And when they're playing with zero probability, you don't care what the factor variable is because they're not playing that strategy at all. And you capture that by requiring that the product to be zero. This is exactly the condition, and this is what makes it a linear complementarity problem. So, I hope that's clear and you can see now similarly for player 2. For player 2, we take each of their actions and we say, if they're going to play it with none, with, with some probability then, and we look at their best response here given whatever player 1 is going to play their mixed strategy. And we're going to look at the slack variable here, and again, we're going to require that the product be zero. In other words the probability that they play j is nonzero just in case the slack variable is zero, okay? So this is the nature of this of this mathematical optimization program. And, of course I forgot to mention, but of course, we want the, the s's to be a probability distribution. Though, they sum to one and they are all nonzero. Alright. So, this is our linear complementarity program. And now come Lemke-Howson who suggest to find a solution in a particular way. And we won't go over it, but the flavor of it is to initialize the s's and the r's a particular way. In fact, to artificially initialize them all to zero, and then one by one take them, it's called a pivoting procedure. Take the, an, an s and an r in turn alternating between the two taking them out of the set that has the current value and replacing it with a complementary variable. If it's an r, replacing it with an s. And if it's an s, replacing it with an r, until you get a, an equilibrium. That's the general flavor of it and in, in this lecture we won't go in more detail into the Lemke-Howson procedure. But it is a procedure that looks very deeply at what a Nash equilibrium is. Sets it up as a mathematical program, and then searches the space of variables in an informed way. Let's now look at a very different procedure. One that doesn't look in as much detail as the structure of equlibria but compensates by by performing uristic search in a certain. So so let's look at it and we'll look at it at at two stages. The first step is to note that when you fix the supports of a strategy profile, finding out whether there is a Nash equilibrium with that support is an easy problem. Remember that the support of a strategy is consists of all the actions to which the players is giving nonzero probability in that mixed strategy. So, let's look at this formulation. Let's look, and this will be limited to two players. and so let's look at all players in turn, for example, player 1, and let's look at every action of that player, for example, a sub i. We'll be looking for some mix strategy, p, mix strategy profile for one for each of the players that will give us a Nash equilibrium, namely each will be best responding. And so, for all actions in the support that we're considering, we'd want the agent to be best responding. So, let's assume that the best response value is v sub i, just call it that number. Then, we want ai to in fact to be a best response to the rest. And what we want is all other actions, and the other action, none is support, to give us a value that's no greater than the best response. And, we want it for each, each of the two players and each of their actions in the support, so that makes sense. And we want these to be a probability so we want the probabilities in the support to be nonzero. We want the probabilities outside the support to be zero, and we want it indeed to be a probability distribution. This all makes sense. So, this is a linear program. It's solved only in polynomial time. that is theoretically there is a polynomial time procedure. in fact, these procedures used are not polynomial of the worst case but but nonetheless effective. By the way, notice that we actually did use the assumption that we're fixing the support here. Superficially, you might look at it and say, oh Â·Â , I could do the same thing but simply ignore the support part. Where, where are we using that really? Well, we're using it in the assumumption that when we're best responding inside and don't have a profit, proper abbreviation. we're actually playing these pi's with probability with a positive probability. Because if we, playing the remaining strategy with zero probability, in fact doesn't matter if we're best responding to it or not. And so, this is where this assumption is hidden. So, now we know that when we fix the support we can solve the question efficiently. the [UNKNOWN] is the fact that there's an exponential number support to explore. And this is the second part, we need to simply now start the exploring the the sets of support. And, I won't go into details about how we do it. But the basic idea is the following. We will bias the support to supports that are close in size to one another, that is we will not start by considering one agent looking at only two strategies among which is randomizing, and the other agent looking at 17 strategies. We'll look at a strategic profile of the similar whose support is similar in size. We also start with small supports and gradualling with the larger supports. If we do that and we involve, and we, we use another trick called conditional domination that is look at certain thing we can ignore along the way, then what turns out that although the procedure is in the worst case exponential as is the Lemke-Howson in fact it performs in practice quite well. And in fact better than essentially all other procedures tried including the the Lemke-Howson. This procedures do have exponential worst case and so the question is, can we do better? Are there procedures that are less than exponential in the worst case? And that takes us from the realm of complex algorithms to the realm of complex analysis. So, let's first remind ourselves a little bit about what complexity analysis looks like. We're looking at classes, whole classes of problems such as the class of old games, and the problem of determining a a sample Nash equilibrium of those games. And we're looking at the how hard is that class as a whole. And so, here are, it's a small part of the complexity hierarchy. The class P as it's known is the class of problem for which a polynomial kind of solution is, is known. The class NP is the class of problems for which a class a solution can be verified in polynomial time, but not necessarily found in polynomial time. The class NP-complete is the hardest among all the NP classes, that is the classes to which all NP problems can be reduced. And perhaps, the biggest answer problem in theoretical computer science is a question of whether NP is different from P, widely believed to be but has not been proved. So, this is the general background we need to keep in mind. And now, we can ask, well, where does where does the problem finding a Nash equilibrium reside in P, in NP? What can we say? Well, first of all, strictly speaking, we can't quite speak about it being in P or NP because we know from Nash's theorem that a Nash equilibrium always exists. So, the question, does it exist in Nash equilibrium is trivial, the answer is yes. So, we need to look at it a little differently. One way to look at differently is to ask for Nash equilibrium with specific kind of properties. So, for example, we can say does it have unique Nash equilibrium? Or does a, existent of equilibrium that's strictly Pareto efficient? Or does, is there a Nash equilibrium that guarantees a given player some minimum pay-off? Or can we guarantee some minimum, some minimum social welfare in a Nash equilibrium? Does the existent Nash equilibrium that include, that includes some, fewer, fewer strategies, some action of the player in it? Or conversely, that does exclude it? All of these and others, are examples of questions that are provably NP NP hard. Okay. So, that tells us something about the hardness. But still we still ask about just finding a sample in Nash equilibrium. How hard is that? So, we've seen the algorithm. People have tried very hard to find algorithms computing a sample in Nash equilibrium. and it does seem hard. The question is, can we somehow capture that formally within the complexity hierarchy? And and and for that, we need to introduce some new node, new new concepts. the essential concept is that of the new class of problems called PPAD for Polynomial Parity Arguments on Directed graphs introduced by Christophe Papadimitriou in 1994. we won't go into detail, but just so you know the chronology. PPAD is a specialization of the class called TFNP, which is in turn was a specialization of a problem called FNP. going detail here is, is, is beyond the scope of, of, of what we want to speak about. But it does help us now position the complexity of finding a sample Nash equilibrium in the complexity hierarchy. And again, we have the class of polynomial time problems, of problem that can be verified in polynomial time with these being the hardest among them. And given that, PPAD turns out to reside somewhere within this class. Now again, we don't know whether this entire class does not collapse and all become one and the same. It's why they believe that it does not but proof doesn't exist. However we do know that PPAD lies someplace in between P and NP. Now what does that have to do with the problem computing a Nash equilibrium. Well, that's where the, the following theorems come in. originally, it was shown that the problem of computing a Nash equilibrium is complete for this class PPAD. That is, it's the hardest among all problems in that class, initially proved for four players than for all, for games with three or more players. And then, finally, in '06 for all, all, all classes game. And so, we widely believe that the problem is not polynomial, cannot prove it but we do know where it reside and within the complex hierarchy that we are familiar with.