So we've talked in a generic manner about centralization and decentralization. Let's now talk, at a bit more technical level, about Bitcoin and decentralization. And a key word that's gonna come up again and again here is consensus, specifically distributed consensus. So what am I talking about here? At a technical level, the key challenge that you have to solve to build a distributed e-cash system is called distributed consensus. And this is a class of protocols that's been studied for decades in the computer science literature. But intuitively, you can think of it as our goal being to decentralize ScroogeCoin, which is the hypothetical currency that we saw in the first lecture. So as I said, there's decades of research in computer science on these consensus protocols. And the traditional motivating application for this is reliability in distributed systems. What do I mean by that? Imagine you're in charge of the backend for a company like Google or Facebook. These companies typically have thousands, or even millions of servers, which form a massive distributed database that records all of the actions that happen on the system, like users' comments and likes and posts, and so on. So when a new comment, let's say, comes in, the way it'll be recorded is that there might be 10 or 15 different nodes, in that massive backend, that might contain copies of this action. Now, what the server needs to make sure is that that comment either gets recorded in all copies of that database, or none of them. If for some reason, because some of these nodes might be faulty, the action gets recorded in none of the databases, it's okay. You can go back to the user and say, there was a problem saving your post, would you please try again? On the other hand, if some of the copies of the database saved it and others didn't, then you'd be in a lot of trouble, because you'd have an inconsistent database.
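To make that all-or-nothing requirement a bit more concrete, here's a toy sketch in Python. The names `Replica`, `record_in_all_or_none`, and `is_consistent` are made up for illustration, and this is not a real consensus protocol: it assumes a node can't fail between the availability check and the write, which is exactly the kind of assumption real systems can't make.

```python
class Replica:
    """One copy of the database, modeled as an append-only list of actions."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy  # False simulates a faulty node
        self.log = []

    def can_accept(self):
        return self.healthy

    def append(self, action):
        self.log.append(action)


def record_in_all_or_none(replicas, action):
    """Write the action to every replica, or to none if any replica is down."""
    if not all(r.can_accept() for r in replicas):
        return False  # caller can ask the user to try again; no copy changed
    for r in replicas:
        r.append(action)
    return True


def is_consistent(replicas):
    """True when every replica holds exactly the same log."""
    return all(r.log == replicas[0].log for r in replicas)


# One faulty node: the comment is saved nowhere, so the copies still agree.
replicas = [Replica("a"), Replica("b"), Replica("c", healthy=False)]
saved = record_in_all_or_none(replicas, "comment: nice post")
print(saved, is_consistent(replicas))  # False True
```

The bad case from the lecture, where some copies save the comment and others don't, is the one where `is_consistent` would return `False`.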
So this is the key problem that motivated the traditional research on distributed consensus. And you can sort of see the similarities to Bitcoin here. But we're gonna talk in a bit more detail about the similarities and differences. So that was the traditional motivating application. But we can also imagine that if we achieved a distributed consensus protocol, and we were able to use that to build a massive, global scale distributed key value store that maps arbitrary keys or names to arbitrary values, then that will enable a lot of applications. For example, a distributed domain name system, which is simply a mapping from human-understandable domain names to IP addresses. Or a public key directory, which is a mapping from user email addresses, let's say, to their public keys. Or even things like stock trades. Because this distributed database, instead of keeping track of who's paid whom how much money, would keep track of who's transferred what units of which stock to whom. And the cool thing about this is that now that Bitcoin has solved the distributed consensus problem, in a certain sense that we'll try to understand in this lecture, we can also go ahead and try to think about solutions to all of these other related problems. And in fact, there are many Altcoins. We'll have several more lectures about Altcoins, but very briefly, Altcoins are systems built on Bitcoin-like principles to achieve, perhaps, slightly different goals. Sometimes currency systems, sometimes not currency systems, such as one of these applications. And so, given that we can solve distributed consensus now, and given that we can build a global distributed key value store, it enables a lot of these other cool applications. Okay, let's go to a technical definition now. The technical definition of distributed consensus is really quite simple. Imagine that there is a fixed number, n, of nodes or processes. And each of these nodes has some input value.
And then a consensus protocol happens. And the two requirements on this consensus protocol are that the protocol should terminate, and all correct nodes should decide on some value, the consensus value, all right? And I say correct nodes, because some of the nodes might be faulty, or even outright malicious. And the second requirement is that this value that they agree upon cannot be an arbitrary value. But it should be a value that was proposed as input by at least one of these correct nodes. So, it's really that simple. But let's try to see what this might mean in the context of Bitcoin. So, to understand how distributed consensus could work in Bitcoin, let's start with a reminder that Bitcoin is a peer-to-peer system, all right? So, what I mean when I say Bitcoin is a peer-to-peer system is that when Alice wants to pay Bob, what she does is she's going to broadcast the transaction to all of the Bitcoin nodes that comprise the peer-to-peer network. And you can see here the structure of the transaction. This is similar to GoofyCoin, that we saw in the first lecture. And what a transaction is going to have is it's going to have Alice's signature, which the other nodes need in order to know that it really, in fact, came from Alice. It's going to have Bob's public key, which also acts as his address at which he wants to receive bitcoins. And further, it contains a hash. What is this hash? Recall this notion of hash pointers that we saw in the first lecture. So this hash is a way for Alice to link together this transaction, or this coin, to her receipt of this coin from someone else previously, all right? So those are the things that are contained in this data structure that we call a transaction. And she's going to broadcast that to all of the Bitcoin peer-to-peer nodes. And notice something funny here. Bob's computer is nowhere in this picture.
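The three pieces of the transaction just described (Alice's signature, Bob's public key, and a hash pointer to the earlier transaction) can be sketched roughly like this in Python. This is a simplified illustration and not Bitcoin's actual transaction format: the field names and the `pay` helper are hypothetical, and the signature is just a placeholder string, since real signing needs a cryptographic library.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    prev_tx_hash: str      # hash pointer to the transaction where the sender got the coin
    recipient_pubkey: str  # recipient's public key, which doubles as their address
    signature: str         # placeholder; a real one is made with the sender's private key

    def tx_hash(self) -> str:
        """SHA-256 over the fields, so later transactions can point back to this one."""
        payload = f"{self.prev_tx_hash}|{self.recipient_pubkey}|{self.signature}"
        return hashlib.sha256(payload.encode()).hexdigest()


def pay(prev_tx: Transaction, recipient_pubkey: str, signature: str) -> Transaction:
    """Build a new transaction that spends the coin received in prev_tx."""
    return Transaction(prev_tx.tx_hash(), recipient_pubkey, signature)


# Alice received the coin earlier, and now pays Bob by pointing back at that receipt.
alice_receipt = Transaction("0" * 64, "alice_pubkey", "sig_by_previous_owner")
alice_pays_bob = pay(alice_receipt, "bob_pubkey", "sig_by_alice")
print(alice_pays_bob.prev_tx_hash == alice_receipt.tx_hash())  # True
```

Notice that nothing in this structure requires Bob to be online. The transaction only names his public key.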
Now Bob, if he wants to be notified that this transaction did in fact happen and that he got paid, he might wanna run a Bitcoin node that's one of these peer-to-peer nodes, in order to listen in on the network and be sure that he's received that transaction. But his listening is not, in fact, necessary for him to receive the funds. The bitcoins will be his whether or not he's running a node on the network. So, given this peer-to-peer system, what is it exactly that the nodes might want to reach consensus on? Well, given that a variety of users are broadcasting these transactions to the network, what everybody wants to reach consensus on is exactly which transactions were broadcast, and the order in which these transactions happened. So what does that mean, specifically? How consensus could work in Bitcoin is that at any given time, all the nodes in the peer-to-peer network would have a sequence of blocks of transactions that they've reached consensus on. So recall that in ScroogeCoin, for optimization purposes, for efficiency, we put transactions into blocks. And we link these blocks together on a block chain. So we're utilizing a similar principle here. We could do consensus on transactions one by one, that would be okay. It would just be inefficient. So instead, we do consensus on a block by block basis. So at any given point, all these nodes in the peer-to-peer network would have the sequence of blocks that they have agreed upon already. And each node would then have a set of outstanding transactions that it has heard about. So recall that for these transactions, consensus has not yet happened. And so, almost by definition, each node might have a slightly different version of the outstanding transactions that it's heard about. The peer-to-peer network is not perfect, so some node may have heard about a transaction, but not other nodes. So given this setup, what could happen is that you have the sequence of blocks that everybody has agreed upon.
A block is just a series of transactions. And now there are, let's say, these three nodes in the system, each of which has an input, the set of outstanding transactions that it's heard about. And they execute together some consensus protocol. And for the consensus protocol to succeed, you can select any valid block, even if it's a block that was proposed by only one node. And for a block to be valid, all of these transactions have to have the right crypto signatures and so on. So you could select any of these valid blocks, and the consensus protocol would still be okay. If some transaction somehow didn't make it into this particular block that gets chosen as the result of the consensus protocol, it could just wait and get into the next block. So maybe this green block gets elected. Now it gets added to the consensus block chain. And then the protocol proceeds in rounds. So if you took the traditional theory of distributed consensus and applied that to Bitcoin, this is the sort of system you might end up with. Now, this has some similarities to how Bitcoin works, but it's not exactly how Bitcoin works. And why is that? The reason for this is simple. Doing things this way is a really hard technical problem, for a variety of reasons. There are some obvious ones. Nodes might crash, and nodes might outright be malicious. But also because the network is highly imperfect. It's a peer-to-peer system. Not all pairs of nodes are connected to each other. There could be faults in the network because of poor Internet connectivity and so on. And finally, there's gonna be a lot of latency in the system, because all of these things happen over the Internet. They're not even within a single data center or something like that. And one particular consequence of this high latency is that there is no notion of global time. What does this mean, and why is it important?
It means that not all nodes can agree to a common ordering of events, simply based on observing timestamps. It just doesn't work like that. So you can't possibly design your protocol by saying things like, take the node that sent the first message in step one, and have that node do something in step two. It just can't work like that, because not all the nodes will agree on which message was sent first, in the first step of the protocol. So this really puts serious constraints on what sorts of algorithms you can really put into your consensus protocols. And in fact, because of these constraints, a lot of the literature on distributed consensus is somewhat pessimistic. And many impossibility results have been proved. I'm just gonna name a couple of these, in case you wanna look them up, but I won't go into too much detail. One impossibility result that's very well known and pretty simple to understand is called the Byzantine generals problem. And a much more subtle one, known by the names of the authors who first proved it, is called the Fischer-Lynch-Paterson impossibility result. Under some conditions, which include the nodes acting in a deterministic manner, what they proved is that consensus is impossible, even with a single faulty process. So, despite these impossibility results, there are a few well-known protocols. And Paxos is probably one of the better known. And what Paxos does is it makes certain compromises. What it gives you is that it never produces an inconsistent result, which would be really bad. But it accepts the tradeoff that, under certain conditions, albeit rare ones, the protocol can get stuck and fail to make any progress. But here's the interesting thing. These impossibility results were proved in a different model. They were intended to study distributed databases. And this model doesn't carry over that well to the setting that Bitcoin operates under.
So these results really tell us more about the model than about the problem, in fact. And what Bitcoin does is that it violates a lot of the assumptions that go into these models. And because of that, consensus in Bitcoin, ironically, works better in practice than in theory. And what this really means is that the theory that was developed for a different set of problems needs to catch up, in order to be able to say really interesting things about Bitcoin. But nevertheless, that theory is quite important because, for example, it can help us predict unforeseen attacks, and really be able to come to strong guarantees on the nature of consensus and security in Bitcoin. So what are these different assumptions? What are some things that Bitcoin does differently? Well, first of all, it introduces the idea of incentives. And this is very different from any previous system for distributed consensus. And this is only possible in Bitcoin because it is a currency. And you can use that currency to give incentives to the participants for acting honestly. And so Bitcoin doesn't quite solve the distributed consensus problem in a general sense. But it solves it in the context of the currency system. The other thing that it does differently is that it really embraces the notion of randomness. And what I mean by that is one of the things it does is, it does away with the notion of a specific starting point and ending point for consensus. Instead, consensus happens over a long period of time, about an hour in the practical system. But even at the end of that time, you're not 100% sure that a transaction or a block that you're interested in has made it into the consensus block chain. Instead, as time goes on, your probability goes up higher and higher. And the probability that you're wrong in making an assumption about a transaction goes down exponentially. So that's the kind of inherently probabilistic guarantee that Bitcoin gives you.
And that's why it's able to completely get around these traditional impossibility results on distributed consensus protocols.
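To get a feel for what "goes down exponentially" means, here's a small Python sketch. It assumes a deliberately simple model that is not from the lecture: every additional confirmation multiplies the chance of being wrong by a fixed factor `decay`. The function name `error_probability` and the value 0.1 are hypothetical and chosen only for illustration. The real rate depends on details of the system that aren't covered here.

```python
def error_probability(decay: float, confirmations: int) -> float:
    """Chance of being wrong after some confirmations, under the assumed model.

    decay is a hypothetical per-confirmation factor strictly between 0 and 1.
    """
    if not 0 < decay < 1:
        raise ValueError("decay must be strictly between 0 and 1")
    if confirmations < 0:
        raise ValueError("confirmations must be non-negative")
    return decay ** confirmations


# Illustrative only: with decay = 0.1, each confirmation cuts the risk tenfold.
for k in range(7):
    print(k, error_probability(0.1, k))
```

The point of the sketch is the shape of the curve, not the numbers: the risk never reaches exactly zero, but it shrinks quickly enough that after waiting a while you can treat the transaction as settled.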