Some terminology. Plaintext, often denoted by the letter P and sometimes called cleartext, is what you want to protect, what you want to encrypt. Ciphertext, often denoted by C, is what you get after encryption.
So we're going to look at a number of encryption techniques. This one is called the Caesar cipher, named after Julius Caesar. It's a simple substitution cipher based on a shift of the alphabet. Taking the English alphabet with an offset of three, A maps to X, B maps to Y, C maps to Z, D maps to A, and so on. That one is pretty easy to crack; don't ever use it in a product that you design.
This one is called the one-time pad, and it's the so-called perfect encryption scheme. It is very difficult, or impractical, to break this algorithm, but it's also just not practical for real-world implementation. So, I'm going to explain how it works. There are 26 letters in the alphabet, and each one is assigned a number: A is equal to zero, B is equal to one, C is equal to two, and so on up to Z, which is equal to 25. The modulus is 26. So, someone takes their plaintext, M-E-E-T T-O-N-I-G-H-T; "meet tonight" is the message that they want to communicate to the person at the other end. The sender takes the plaintext and writes down the number for each of the letters, so E is four in the numbering scheme I'm using here, and translates all of the characters that way. Ahead of time, the two parties have mutually decided on a key consisting of letters. This key could be anything they previously agreed upon. For example, you could use the 365 days in the year: take the day number, treat it as a page number in a book, and both parties have a copy of that book. So, on the first day of the year they take a string of characters from page one for the key, and on day 200 they go to page 200 and take the characters at the top of the page for their key, so every day has a different key.
I just made this one up: the key is D, Z, H, S, U, I, M, W, E, K, C, and you write down the number associated with each letter there as well, so D is 3, Z is 25, et cetera, all the way across. Then you add the plaintext to the key to produce the sum, so we get 15, 29, 11, 37, et cetera, all the way across, just adding up the two. Then you take mod 26 of each sum and you end up with the remainder, so 29 mod 26 is 3, 37 mod 26 is 11, and so on. These remainders will be in the range of zero to 25, and then you convert those back into their letter equivalents, so the ciphertext becomes P, D, L, L, N, W, Z, E, K, R, V, and this is what is sent to the receiver. The receiver just does the reverse calculation: takes the ciphertext, turns it back into the numbers we saw up here, already has the key ahead of time, and does a subtraction, so 15 minus 3 is 12, and does this for each of the individual characters. When you see a negative, it just wraps back around: 3 minus 25 is minus 22, which becomes 4; minus 7 becomes 19; 22 minus 8 is just 14; minus 18 becomes 8. Then you can take the numbers and turn them back into the letters of the original plaintext. So if you ever really want to communicate with someone securely, this is a great way to do it, assuming no one else knows what you're using for the key.
Jumping ahead, what's used today, to my knowledge, in just about all cryptographic systems is AES, the Advanced Encryption Standard. It was established by the United States National Institute of Standards and Technology (NIST). It's what's called a block cipher. So, 16 bytes of the data that you want to encrypt go into the encryption engine, an iterative process takes place, and you get 16 bytes out. Then the next 16 bytes go in, the algorithm runs, and 16 bytes come out, and that process continues until you get to the end of your message.
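Before going further into AES, here is a minimal Python sketch of the one-time pad arithmetic worked through above. The function names (`to_numbers`, `otp_encrypt`, `otp_decrypt`) are made up for illustration, and the sketch assumes an uppercase A to Z alphabet with spaces dropped, as in the "meet tonight" example.

```python
A = ord("A")

def to_numbers(text):
    """Map letters to numbers with A=0 ... Z=25, dropping spaces."""
    return [ord(ch) - A for ch in text.upper() if ch.isalpha()]

def to_letters(numbers):
    """Map numbers 0..25 back to the letters A..Z."""
    return "".join(chr(n + A) for n in numbers)

def otp_encrypt(plaintext, key):
    p, k = to_numbers(plaintext), to_numbers(key)
    if len(k) < len(p):
        raise ValueError("key must be at least as long as the message")
    # Add plaintext and key letter by letter, then reduce mod 26.
    return to_letters([(pi + ki) % 26 for pi, ki in zip(p, k)])

def otp_decrypt(ciphertext, key):
    c, k = to_numbers(ciphertext), to_numbers(key)
    if len(k) < len(c):
        raise ValueError("key must be at least as long as the message")
    # Subtract the key; Python's % wraps negative results back into 0..25.
    return to_letters([(ci - ki) % 26 for ci, ki in zip(c, k)])

ciphertext = otp_encrypt("MEET TONIGHT", "DZHSUIMWEKC")
recovered = otp_decrypt(ciphertext, "DZHSUIMWEKC")
print(ciphertext)  # PDLLNWZEKRV
print(recovered)   # MEETTONIGHT
```

The `% 26` in the decrypt step is the "wraps back around" behavior from the lecture: for example, 3 minus 25 is minus 22, and `-22 % 26` evaluates to 4 in Python.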
It supports three key lengths: 128 bits, 192 bits, and 256 bits. A very high-level description of the process: there's this notion of rounds in this algorithm, and N equals the number of rounds. So, we start with an encryption key that is either 128, 192, or 256 bits, and the number of rounds is equal to 10, 12, or 14 respectively. Think of a for loop: for a 128-bit key it would be for N equals zero to nine, and for a 256-bit key it would be for N equals zero to 13, so you go around this loop 14 times. So, the number of rounds of calculation that take place is a function of the key length, and it's either 10 or 12 or 14. It starts out with round zero, then it runs rounds one to N minus two, and then it ends by running the very final round.
So this may be a surprising statement: this algorithm is believed to be secure, but we, the general public and the academic community, don't know how to prove it. We can't prove that it's secure; we just believe it to be secure.
So, here's a picture of the AES encryption operation. You start out with your cipher key, or encryption key. The first thing that happens in the AES algorithm is that this key, whether it's 128 bits or all the way up to 256 bits, is expanded into round keys for however many rounds there are. So, you take your key, this key expansion happens, and you get a unique key for each round. So, whatever key you pick here, if this was AES-256, you'd get a unique round key for each of the 14 rounds (strictly, the specification also derives one more for an initial key addition before the first round, so 15 in total). Your plaintext, 16 bytes, comes into the AES core, which takes the first round key and does an encryption cycle. Then it takes that data and uses the next round key, round key one, and then round key two, and round key three, and it keeps going around cranking on the data, munging up all of the bits, until all of the round keys have been used, and then it spits out 16 bytes of ciphertext.
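As a rough illustration of how the round count follows from the key length, the sketch below lists the order of steps applied to one 16-byte block. It does not implement any of the transformations. The step names (SubBytes, ShiftRows, MixColumns, AddRoundKey) come from the AES specification, FIPS 197, rather than from this lecture, and `aes_round_plan` is a made-up helper name. Note that the specification counts an initial key addition before the first round, so it uses one more round key than there are rounds.

```python
# Number of rounds for each AES key length, as listed above.
ROUNDS_BY_KEY_BITS = {128: 10, 192: 12, 256: 14}

def aes_round_plan(key_bits):
    """Return the ordered steps applied to one 16-byte block (names only)."""
    rounds = ROUNDS_BY_KEY_BITS[key_bits]
    plan = ["initial AddRoundKey (round key 0)"]
    # The middle rounds all apply the same four transformations.
    for r in range(1, rounds):
        plan.append(
            f"round {r}: SubBytes, ShiftRows, MixColumns, AddRoundKey (round key {r})"
        )
    # The final round has a different shape from the middle rounds.
    plan.append(
        f"round {rounds} (final): SubBytes, ShiftRows, AddRoundKey (round key {rounds})"
    )
    return plan

for bits in (128, 192, 256):
    plan = aes_round_plan(bits)
    print(f"AES-{bits}: {ROUNDS_BY_KEY_BITS[bits]} rounds, {len(plan)} round keys")
```

Each entry in the plan consumes exactly one round key, which is why the key expansion step has to produce all of them before the first block is processed.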
Decryption works essentially the same way, run in reverse. You've got 16 bytes of ciphertext going into the AES core logic, and that same key expansion process takes place, generating the same round keys. The specification tells you explicitly, in great detail, how to take a cipher key and create all the round keys and how to perform each iteration; it's all in the specification. So it cranks through its rounds and produces 16 bytes of plaintext, and then again, 16 bytes of ciphertext come in, it cranks through its rounds, and plaintext comes out.
When the AES algorithm is used as per the specification, one block at a time, it is known as Electronic Code Book. I'm not sure where that term came from, but that's what everyone knows it as, Electronic Code Book. Sixteen bytes in, cranked around, 16 bytes out. So, shouldn't that be good enough? That seems like a pretty secure, pretty reasonable approach. Can you think of any issues that might arise by taking 16 bytes in, cranking on them, and taking 16 bytes out?
So here's an example. One of the issues with Electronic Code Book, ECB, is that the same plaintext always encrypts to the same ciphertext. The result is that it leaks information. Why does it leak information? What does that leaking look like? There's a fabulous picture out on Wikipedia about block mode encryption, and I have used this image in multiple training sessions over the years because it really drives home the point. So here's our Linux penguin picture. You feed that into the AES algorithm, and it doesn't matter what key you use. Whatever the key, whatever the key length, it doesn't matter. The same plaintext always encrypts to the same ciphertext. What you get, this is a bitmap. Look at that. It's leaking information. Okay, it doesn't have the fidelity of the original, but we can kind of see what it is. If we're familiar with this image and then we encounter this, we can look at the raw data and say, well, it's the Linux penguin. The point is that it leaks information.
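The leak can be reproduced in a few lines. The sketch below does not use AES: to stay within the Python standard library it substitutes a toy 16-byte block cipher (a four-round Feistel construction built on SHA-256) that is an assumption of this example and is not secure. All function names are made up for illustration. The ECB behavior it shows, identical plaintext blocks producing identical ciphertext blocks, holds for any block cipher used this way.

```python
import hashlib

BLOCK = 16

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key, block):
    """Toy stand-in for the AES core: a 4-round Feistel cipher. NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def ecb_encrypt(key, data):
    """ECB: every 16-byte block is encrypted independently with the same key."""
    if len(data) % BLOCK:
        raise ValueError("data must be a multiple of 16 bytes")
    return [toy_encrypt_block(key, data[i:i + BLOCK])
            for i in range(0, len(data), BLOCK)]

# Think of these as pixel rows: lots of repeated background, one different block.
white = b"WHITE BACKGROUND"   # 16 bytes
black = b"BLACK PENGUIN!!!"   # 16 bytes
ct_blocks = ecb_encrypt(b"any key at all", white + white + black + white)

# Label each distinct ciphertext block with a letter to expose the pattern.
labels = {}
pattern = "".join(labels.setdefault(b, chr(ord("A") + len(labels)))
                  for b in ct_blocks)
print(pattern)  # AABA
```

An observer who sees only the ciphertext still learns which blocks of the original were equal, which is the outline of the penguin showing through.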
So, cryptographers got on that right away and cooked up a solution. They said, all right, the same plaintext encrypting to the same ciphertext is a poor choice, it's a poor implementation, we can do better. They created what's called cipher block chaining mode, known as CBC. So what happens here: this shows three 16-byte chunks of data being encrypted. So 16 bytes of plaintext come in here, at the very beginning of the chain, and there are 16 bytes of initialization vector that get exclusive-ORed with the plaintext. That is then fed into the block cipher encryption, the standard AES algorithm, and we get the ciphertext out. And of course the key goes in and the round keys get generated, as before. So 16 bytes come in and get XORed with the initialization vector, the key goes in, it does its rounds, and it spits out the ciphertext. This ciphertext then seeds the next 16 bytes of data coming in and serves as their initialization vector (IV stands for initialization vector): it's XORed with the plaintext, and this process continues for each 16 bytes.
This is a picture of the deciphering. I start again with the initialization vector. A ciphertext block comes in along with the key and gets decrypted, and then the initialization vector is XORed with that decrypted output to get the plaintext back. Then this ciphertext block, which was used in the XOR going into the next encryption step, now comes in down here, and this process repeats to undo the chain.
So, we're back with our picture, and we have AES in CBC mode, and what do we get? A much better result. It does not leak information the way ECB does.
These initialization vectors, these values, should be chosen wisely. What does it mean to be chosen wisely? Ideally, each 16-byte block would have a unique and unpredictable IV.
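The chaining described above can be sketched as follows. As a loudly stated assumption, the block cipher here is a toy four-round Feistel construction built on SHA-256 so that the example runs with only the Python standard library; it stands in for the AES core and is not secure. The function names are made up for illustration, and the IV is drawn from `secrets.token_bytes` as one way of getting an unpredictable value.

```python
import hashlib
import secrets

BLOCK = 16

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, rnd, half):
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]

def toy_encrypt_block(key, block):
    """Toy stand-in for the AES core (4-round Feistel). NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right

def toy_decrypt_block(key, block):
    """Inverse of toy_encrypt_block: undo the rounds in reverse order."""
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right

def cbc_encrypt(key, iv, data):
    prev, out = iv, []
    for i in range(0, len(data), BLOCK):
        # XOR the plaintext with the IV (or previous ciphertext), then encrypt.
        prev = toy_encrypt_block(key, xor(data[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key, iv, data):
    prev, out = iv, []
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        # Decrypt, then XOR with the IV (or previous ciphertext) to undo the chain.
        out.append(xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)

key = secrets.token_bytes(16)
iv = secrets.token_bytes(16)            # unpredictable IV
plaintext = b"WHITE BACKGROUND" * 3     # three identical 16-byte blocks
ciphertext = cbc_encrypt(key, iv, plaintext)
blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]

print(len(set(blocks)))                 # distinct ciphertext blocks
print(cbc_decrypt(key, iv, ciphertext) == plaintext)
```

Because each block is mixed with the previous ciphertext before encryption, the three identical plaintext blocks come out as three different ciphertext blocks (barring a collision of negligible probability), so the repetition that ECB exposed is hidden.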
You can use a counter, and people have used a counter, but a counter is predictable; you want to try to think up something that's unpredictable. The CBC mode specification doesn't dictate exactly how you come up with the IV, so that's one area where you can get creative as a design engineer and decide what value you want to use for the IV. But if you can make it unpredictable, it makes things that much harder; it's just one more gate that an attacker, an adversary, has to get through. If it's a counter, counters can be figured out pretty easily. They will try it: they'll start at zero or one or whatever, give it a whirl, and see if they get anywhere with that. If it's an unpredictable IV, it makes it tougher for them.
AES XTS mode: XTS is an acronym of acronyms, and I can't recall what it stands for; I'd have to go look it up. So, NIST added XTS mode for storage devices. It's specified in the SP 800 series of documents; in this case, it's SP 800-38E. Later, we'll go take a look at all the SP 800 documents published by NIST; there are quite a few of them, and they address a range of issues. XTS is deemed better than cipher block chaining. When I was at Seagate we did ECB mode and we did CBC mode and we did XTS mode. We had a lot of conversations that went like, "Well, since XTS is deemed better than CBC, shouldn't we take CBC out of the design? Why would you use CBC if you've got XTS mode available to you?" I don't know what happened. It's probable that by this point in time they have removed CBC mode from their hardware encryption engine, but that's just a guess on my part.
One of the reasons why you'd want to take it out is that it presented timing problems. There were issues with the implementation and with closing timing when CBC mode was in there. So, that's just a flip-flop, through a bunch of combinational logic, to a flip-flop, and it created these long paths, and so it was always a pain for timing closure when we were taping chips out and getting ready to build these things. We thought, "We get rid of CBC, since XTS mode is better, and we'll get rid of all these CBC timing problems", but I'm not sure exactly what happened. Everybody that I'm aware of today in storage does XTS mode.
AES is called a symmetric encryption algorithm because it uses the same key for the encryption and the decryption steps. There's another kind of encryption called asymmetric encryption. It's also known as public key encryption or public key cryptography. It uses a pair of keys, one public and one private. It provides two functions: it provides encryption, and it also provides authentication, meaning it verifies that the message came from the holder of the matching private key or public key, depending on which one gets exchanged. So the way you can think about how these keys get generated is that you start with some large random number, and that feeds into the key generation process. Generating these keys in this step, one public and one private, is computationally easy, computationally inexpensive. However, it is computationally difficult, or impractical, to compute the private key given the public key. The public key is the key that's handed out; generally anyone can get access to it. We'll see later, for TLS and how web browsers establish secure communication channels, that the public key is public: anyone can go get it, anyone can go look at it. The private key needs to be kept private. It's very difficult to go this way, from the public key back to the private key. Notice I didn't say it's impossible. If you hear me make an absolute statement in the next two weeks, point it out, all right?
Because then I'm in violation of my own thought process, my own mindset. I try very hard not to talk in terms of absolutes when we're discussing security.
So how does it work? Alice wants to receive secure communication from Bob. This box in the center represents some kind of insecure channel, some public communication channel of some nature. Alice generates a public/private key pair. Alice keeps her private key private and sends Bob her public key, so Bob's got the public key over here. Bob will take that public key and type a message, "Hi Alice", and feed it into an encryption algorithm. D is the data for the encryption algorithm and K is the key that's used, and I use this nomenclature in all the pictures and diagrams coming up. So the message goes in and the public key goes in, it gets encrypted, and some gobbledygook goes across this insecure channel. Alice receives that message, feeds it into the decryptor using her private key, and is able to extract the original message.
Does Alice know that the message really came from Bob? What if Eve snatched Alice's public key when it was being sent? Eve was sitting off to the side here: "Oh, I'm going to grab that key and hang onto it, and I can masquerade as Bob", and Alice would have no idea. She thinks she sent the key only to Bob, but now Bob and Eve both have the key. So, it's a risk.
You flip this around, and, well, does it get any better? Alice here still wants to receive secure communication from Bob, so Bob can generate a public and private key pair and send Alice his public key, and do exactly the same thing: take the message, Bob uses his private key to encrypt it, it comes across the insecure channel, it comes over into the decryptor, and Alice uses Bob's public key to decrypt it to get the message. This has exactly the same problem. Eve could have snatched the key, so it's something to be aware of.
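Both scenarios can be tried out with a toy key pair. The lecture does not name a specific public key algorithm, so as an assumption this sketch uses textbook RSA with tiny primes and no padding, which is nowhere near secure and is only meant to show the two directions in which a key pair can be used. All names are made up for illustration, and `pow(e, -1, phi)` requires Python 3.8 or later.

```python
def make_toy_keypair(p, q, e=17):
    """Toy RSA key pair from two small primes. Far too small for real use."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)      # private exponent: inverse of e modulo phi
    return (e, n), (d, n)    # (public key, private key)

def apply_key(values, key):
    """Encrypt or decrypt a list of small integers with the given key."""
    exponent, modulus = key
    return [pow(v, exponent, modulus) for v in values]

message = list(b"Hi Alice")

# Scenario 1: Alice publishes her public key; Bob encrypts with it,
# and only Alice's private key recovers the message.
alice_public, alice_private = make_toy_keypair(61, 53)
from_bob = apply_key(message, alice_public)
print(bytes(apply_key(from_bob, alice_private)))   # b'Hi Alice'

# Eve also holds Alice's public key, so she can produce the same ciphertext.
from_eve = apply_key(list(b"Hi Alice"), alice_public)
print(from_eve == from_bob)                        # True

# Scenario 2 (flipped): Bob transforms the message with his private key,
# and anyone holding Bob's public key, Alice or Eve, can reverse it.
bob_public, bob_private = make_toy_keypair(67, 71)
from_bob_2 = apply_key(message, bob_private)
print(bytes(apply_key(from_bob_2, bob_public)))    # b'Hi Alice'
```

In the first scenario nothing in the ciphertext tells Alice whether Bob or Eve produced it, and in the second scenario anyone who holds Bob's public key can read the message, which is the risk described above.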