[SOUND]. >> As we have seen, any encryption scheme with a keyspace that is too small will be vulnerable to an exhaustive search attack in which an attacker tries to decrypt the cipher text using every possible key. But it's important to understand that just because a scheme has a large keyspace, that doesn't mean it's secure. To see a good example of this, let's look at an extension of the shift cypher that has a much larger key space. In the Vigenere cypher, the key is now a string rather than just a single character. To encrypt a message, we now shift each character in the plain text just as in the shift cypher. But by an amount given by the next character of the key, wrapping around in the key is necessary. So for example, if we encrypt the plain text tell him about me, using the key cafe, then we shift the first letter of the plain text by the amount corresponding to the first letter of the key, or C. That is by two positions. We shift the second letter of a plain text by the amount corresponding to the second character of the key, or A. That is, by zero positions, and so on. Since our key is only four characters long, when we reach the fifth character of the plain text, we wrap around and use the first character of the key to encrypt that character, and, so on. Decryption just reverses this process in the obvious way. What's the size of the key space in this scheme? Well, if we assume that keys are 14 characters long, then the number of possible keys is 26 to the 14th power. Since there are 26 possibilities for each of the 14 characters. In base 2, this is about 2 to the 66, which is a large number. In a later lecture, we'll do some calculations to show exactly how big this is. But for now, just take my word that the key space here is large enough to pretty much rule out Brute-force attacks. Does this mean the Vigenere cipher is secure? Well, it was thought to be secure for many years, precisely because the key space was so large. In the end, however, the Vigenère cypher is not secure. The reason is that a Brute-force exhaustive search attack is not the only attack possible. And by being clever, we can attack the scheme in other ways. Here, I'm going to walk through a very basic attack assuming for simplicity that the key is always exactly 14 characters long. The main observation is that if you look at a cipher text, then every 14th character in that cipher text has been shifted by the same amount. Or in other words, if you just focus on every 14th character of the cipher text, it is just as if you have a cipher text that was output by the shift cipher. For example, take this cipher text here output by Vigenere cipher. I bolded every 14th character, starting from the first, and each of those characters was obtained by shifting the underlying plain text in those positions by the same amount. Similarly, if we look at every 14th character starting from the third position, each of those characters was obtained by applying the same shift to the underlying plain text in those positions. To be clear, the shift for the bolded characters and the underlined characters may be different from each other. But the shift for all the bolded characters is identical. And the shift for all the underlined character is identical. So what we've said is that by looking at every fourteenth character in the cypher text, we get something that was generated just like a ciphertext encrypted by the shift cipher. We know we can do a Brute-force attack on the shift cipher, so are we done? Well, not quite. Remember that the brute-force attack on the shift cipher work by trying to decrypt the ciphertext using every possible key, and looking at the list of 26 possible plain texts for one plain text that makes sense. Here, however, that's not going to work. The reason is that even when we guess the correct shift, what we get is every 14th character of the original plain text, which will look like gibberish even when you get the guess of the shift correctly. So we're going to have to work a little bit harder here. One thing we can do instead is to rely on plaintext letter frequencies. Here's a table showing letter frequencies for normal English text. You can see that E, for example, is the most common letter occurring about 13% of the time in English text. At the other end of the scale, and not too surprising for speakers of English, we see that the letters Z and Q occur only about 0.1% of the time. Letter frequencies depend on the language, of course, but can be found or obtained for just about any language in existence. So, how can we use letter frequencies to attack the Vigenere cipher. Well, a simple way to use them is as follows. Look at every 14th character of the ciphertext starting with the first. Let alpha denote the most commonly occurring letter in that portion of the ciphertext. We're going to guess that the ciphertext character corresponds to the most common plain text letter E. Assuming that's the case, the first character of the key must be equal to the shift needed to go from the plain text character E to the cipher text character alpha. That is the shift must be alpha minus E. We can repeat this process for every 14th character starting from the second position, starting from the third position, etc., until we learn all the characters of a key. Now I should point out that this attack is not perfect. It requires a long ciphertext, long enough so that taking every 14th character in the ciphertext. There’s enough characters to get representative statistics for the underlying plain text frequencies. And even man they can make a mistake and get some characters of the key wrong. However, this attack will work given a long enough cipher text. Moreover, better attacks on the Vigenere cipher are known. I didn't describe those here because the purpose of this exercise is not really to study the Vigenere cipher itself, but just to show that a little bit of cleverness can sometimes give an attack on an encryption scheme that does much better than exhaustive search. So, what have we seen so far? Well, we've seen two examples of encryption schemes. The shift cipher and the Vigenere cipher. Both of which could ultimately be broken. The second example, the Vigenere cipher, actually took a little bit of cleverness to break and may have looked secure at first glance. Well, how do we know in general whether any encryption scheme we come up with, no matter how good it looks, is actually secure? Wouldn't it be nice if we could prove security of some encryption scheme? So we could be sure that no attacks were possible? Before we can hope to do that, however, we first have to define precisely what we mean by security here. There are actually two distinct reasons for this. The first is simply that if we want to have rigorous mathematical proof of security, then we're going to need a rigorous, precise definition of security in place first. But it goes beyond that. Even at an informal level, it's important to have a clear idea in mind of what security means. Until now, I've pretty much waved my hand and said something about an attacker not learning any information about the plain text. But what I have in mind when I say that may not be what you have in mind when you hear it. Even when I showed attacks on the schemes, I made a bunch of assumptions that you might find questionable. For example, in the attacks on the shift in Vigenère ciphers, I implicitly assumed that what was being encrypted was English text. But what if the parties were encrypting Spanish or, or, or transliterated Hebrew? What if I don't even know what language they're using? What if I know they're using English but they were encrypting C code rather than regular English language text? The attacks I gave won't work in this case. But does that mean that schemes are secure for those applications? All of this discussion just highlights the importance of having a formal definition in place. And this is something we'll turn to in the next few lectures.