0:00

In the previous segment we saw how to build public key encryption from trapdoor

functions, in this segment we're going to build a classic trapdoor function called

RSA. But first let's quickly review what a trapdoor function is. So recall that a

trapdoor function is made up of three algorithms. There is a key generation

algorithm, the function itself, and the inverse of the function. The key

generation algorithm outputs a public key and a secret key. And in this case, in

this lecture the public key is going to define a function that maps the set X onto

itself. Which is why I call these things trapdoor permutations, as opposed to

trapdoor functions, simply because the function maps X onto itself, whereas a

trapdoor function might map the set X to some arbitrary set Y. Now given the public

key, the function, as we say, basically defines this function from the set X to

the set X. And then given the secret key, we can invert this function. So this

function F evaluates the function in the forward direction, and this function F

inverse, which means the secret key SK, computes F in the reverse direction. Now

we say that the trapdoor permutation is secure if the function defined by the

public key is in fact a one-way function, which means that it's easy to evaluate,

but without the trapdoor, the secret trapdoor, it's difficult to invert. Before

we look at our first example of a trapdoor permutation, I want to do a quick review

of some necessary arithmetic facts that we're gonna need. And in particular,

let's look at some arithmetic facts modulo composites. So here we have our

modulus N, which happens to be a product of two primes, and you should be thinking

of P and Q as roughly equal size numbers, since particular P and Q are both roughly

on the size of square root of N. So both are roughly the same size. Recall that we

denoted by ZN the set of integers from zero to N minus one, and we said that we

can do addition and multiplication module N. We denoted by ZN star the set of

invertible elements in ZN. So these are all the elements, which have a modular

inverse. And we said that actually X is invertible if and only if it's relatively

prime to N. Moreover, I also told you that the number of invertible elements in ZN

star is denoted by this function phi(N). So phi(N) is the number of invertible elements in ZN

star, And I told you that when N is a product of two distinct primes, then in

fact phi(N) is equal to (P - 1) times (Q - 1) and if you extend that out, you

see that this is really equal to (N - P - Q + 1). Now remember that I said

that P and Q are on the order of square root of N which means that P + Q is

also on the order of square root of N. Which means that really phi(N) is on the

order of N minus two square root of N. So, in other words, it's very, very close to

N. Basically, subtracting the square root of N from a number, this is from, N is

going to be a huge number in our case, like 600 digits. And so subtracting from a

600 digit number, the square root of that 600 digit number, namely a 300 digit

number, hardly affects the size of the number. Which means that phi(N) is really,

really, really close to N, and I want you to remember that, because we will actually

be using this now and again. And so the fact that phi(N) is really close to N, means

that if we choose a random element modulo N It's very, very, very likely to be in ZN

star. So it's very unlikely that by choosing a random element in ZN, we will

end up choosing an element that is not invertable. Okay. So just remember that,

that in fact almost all elements in ZN are actually invertible. And the last thing

that we'll need is Euler's theorem, which we covered last week, which basically says

that for any element X in ZN star, if I raise X to the power of phi(N), I get one, in

ZN. So in other words, I get 1 modulo N. I'll say it one more time because this

is gonna be critical to what's coming. Again X to the phi(N) is equal to 1 modulo

N. So now that we have the necessary background we can actually describe the

RSA trapdoor permutation. This is a classic, classic, classic construction in

cryptography that was first published in Scientific American back in 1977, this is

a very well known article in cryptography. And ever since then, this was 35 years

ago, the RSA trapdoor permutation has been used extensively in cryptographic applications.

For example, SSL and TLS use RSA both for certificates and for key exchange. There

are many secure email systems and secure file systems that use RSA to encrypt

emails and encrypt files in the file system. And there are many, many, many

other applications of this system. So this is a very, very classic, crypto

construction, and I'll show you how it works. I should say that RSA is named for

its inventors, Ron Rivest, Adi Shamir and Len Adleman, who were all at MIT at the

time they made this important discovery. So now we're ready to describe the RSA

trapdoor permutation. To do that, I have to describe the key generation algorithm,

the function ƒ and the function ƒ−1. So let's see. So the way the key

generation algorithm works is as follows: What we do is we generate two primes, P

and Q, each would be, say on the order of 1000 bits, so, you know, roughly 300

digits, and then the RSA modulus is simply going to be the product of those two

primes. The next thing we do is we pick two exponents, e and d, such that e times

d is 1 modulo phi(N). What this means is that e and d first of all are relatively

prime to phi(N) and second of all they're actually modular inverses of one another,

modulo phi(N). And then we output the public key as the pair N,e and the

secret key is the secret key N,d. I should mention that the exponent e,

that the number e is sometimes called the encryption exponent and the exponent d is

sometimes called the decryption exponent. And you'll see why I keep referring to

these as exponents in just a second. Now the way the RSA function itself is

defined is really simple. For simplicity I'm gonna define it as the function

from ZN star to ZN star. And the way the function is defined is simply given an

input X, all we do is we simply take X and raise it to the power of e in ZN. So we

just compute X to the e, modulo N. That's it. And then to decrypt, what we do is we

simply, given an input Y, we simply raise Y to the power of d, modulo N. And that's

it. So now you can see why as I refer to e and d as exponents. They're the things

that X and Y are being raised to. So let's quickly verify that F inverse really does

invert the function F. So let's see what happens when we compute Y to the d. So

suppose Y itself happens to be the RSA function of some value X. In which case,

what Y to the d is, is really RSA of X to the power of d. While X by itself is

simply gonna be X to the e modulo N, And therefore, Y to the d is simply X to the e

times d modulo N. Again, just using rules of exponentiation, the exponents multiply,

so we get X to the e times d. But what do we know about e times d? e times d we said

is one modulo phi(N). And what that means is there exists some integer such that e

times d is equal to K times phi(N) plus one. This is what it means that e times d is

one modulo phi(N). So, we can simply replace e times d by K times phi(N)+1. So, that's

what I wrote here. So, we have X to the power of K times phi(N)+1. But now again

just using rules of exponentiation, we can re-write this as X to the power of phi(N) to

the power of K times X. This is really the same thing as K times phi(N)+1 as the

exponent. I just kind of separated the exponent into it's different components.

Now, X to the phi(N) by Euler's theorem, we know that X to the phi(N) is equal to one. So what

is this whole product equal to? Well since X to the phi(N) is equal to one, one to

the K is also equal to one, so this whole thing over here is simply equal to one.

And what we're left with is X. So what we just proved is that if I take the output of

the RSA function and raise it to the power of 'd', I get back X. Which means that

raising to the power of 'd' basically inverts the RSA function, which is what we

wanted to show. So that's it, that's the whole description of the function, we've

described the key generation. The function itself, which is simply raising things to

the power of e modulo N, and the inversion function which is simply raising things to

the power of d, also modulo N. The next question is, why is this function secure?

In other words, why is this function one way if all I have is just a public key,

but I don't have the secret key? And so to argue that this function is one way we

basically state the RSA assumption which basically says that the RSA function is a

one way permutation. And formally the way we state that is that, basically for all

efficient algorithms, A, it so happens that if I generate two primes p and q,

random primes p and q. I multiply them to get to modulus N and then I choose a

random 'y' in ZN star. And now I give the modulus, the exponent and this 'y' to

algorithm A, the probability that I'll get the inverse of RSA at the point Y, mainly

I'll get Y to the power of one over E. That's really what the inverse is. This

probability is negligible. So this assumption is called the RSA assumption.

It basically states that RSA is a one way permutation just given the public [key?]. And

therefore, it is a trapdoor permutation because it has a trapdoor. And makes this

easy to invert for anyone who knows the trap door. So now that we have a secure

trap door permutation, we can simply plug that into our construction for public key

encryption, and get our first real world public key encryption system. And so what

we'll do is we'll simply plug the, the RSA trapdoor permutation into the iso standard

construction that we saw in the previous segment. So, if you recall, that

construction was based on a symmetric encryption system that had to provide

authenticated encryption. And it was also based on a hash function. Then mapped

while transferring it into the world of RSA, it maps elements in

ZN, into secret keys for the symmetric key system. And now the

way the encryption scheme works is really easy to describe. Basically algorithm G

just runs the RSA key generation algorithm and produces a public key and a secret

key. Just as before. So you notice the public key contains the encryption

exponent and the, secret key contains the decryption exponent. And the way we

encrypt is as follows. Well, we're going to choose a random X in ZN. We're going

to apply the RSA function to this X, we're going to deduce a symmetric key from this

X by hashing it, so we compute H of X to obtain the key K, and then we output this

Y along with the encryption of the message under the symmetric key K. And in

practice, the hash function H would be just implemented just using SHA-256, and

you would use the output of SHA-256 to generate a symmetric key that could then

be used for the symmetric encryption assistant. So that's how we would encrypt

and then the way we would decrypt is pretty much as we saw in the previous

segment, where the first thing we would do is we would use the secret key to invert

the header of the ciphertext. So we would compute RSA invert of Y, that would give

us the value X. Then we would apply the hash function to X so that this would give

us the key K. And then we would run the decryption algorithm for the symmetric

system on the ciphertext and that would produce the original message m. And then

we stated a theorem in the previous segment to say that if the RSA trapdoor

permutation is secure, Es and Ds, the symmetric encryption scheme [inaudible]

provides authenticated encryption. And as we said, H is just random oracle. It's,

you know, kind of a random function from ZN to the key space. Then, in fact, this

system is chosen cipher text secure, and is a good public key system to use.

11:36

So now that we have our first example of a good public key system to use, I wanna

quickly warn you about how not to use RSA for encryption. And this again something

that we said in the previous segment. And I'm going to repeat it here, except I'm

going to make it specific to RSA. So I'd like to call this, textbook RSA.

Which basically is the first thing that comes to mind when you think about

encrypting using RSA, namely, the secret key and the public key will be as before.

But now instead of running through, a hash function to generate a symmetric key, what

we would do is we would directly use RSA to encrypt the given message M. And then

we would directly use the decryption exponent to decrypt the cipher text and

obtain the plain text M. I'm going to call this textbook RSA, because there are many

textbooks that describe RSA encryption in this way. And this is completely wrong.

This is not how RSA encryption works. It's an insecure system. In particular,

it's deterministic encryption, and so it can't possibly be semantically secure, but

in fact there are many attacks exist that I'm going to show you an attack in just a

minute, but the message that I want to make clear here, is that RSA, all it is,

is a trap door permutation. By itself it's not an encryption system. You have to

embellish it with this ISO standard for example, to make it into an encryption

system. By itself, all it is, is just a function. So let's see what goes wrong if

you try to use textbook RSA, In other words, if you try to encrypt a message

using RSA directly. And I'll give you an example attack from the world of the web.

So imagine we have a web server. The web server has an RSA secret key. Here's it's

denoted by N and D. And here we have a web browser who's trying to establish a secure

session, a secure SSL session, with the web server. So the way SSL works is that the

web browser starts off by sending this client hello message saying, hey, I want

to set up a secure session with you. The web server responds with a server hello

message that contains the server's public key And then the web browser will go ahead

and generate a random what's called a premaster secret K, it will encrypt the

premaster secret using K and send the result in ciphertext over to the web

server. The web server will decrypt and then the web server will also get K, so

now the two have a shared key that they can use to then secure a session between

them. So I want to show you what goes wrong if we directly use the RSA function

for encrypting K. In other words, if directly K is encrypted as K to the e

modulo N. Just for the sake of argument let's suppose that K is a 64-bit key.

We're going to treat K as an integer in the range as zero to 2 to the 64.

More precisely two to the 64 minus one, and now what we're going to do is the

following. First of all, suppose it so happens that K factors into a product of

roughly equal sized numbers. So K we can write as K1 times K2, where K1 and K2 are

integers. And both are say less than two to the 34. So, it turns out this happens

with probability roughly twenty percent so one in five times K can actually be

written in this way. But, now if we plug this K, K=K1 x K2 if we plug that into the

equation that defines the cipher text you see that we can simply substitute K by K1 x k2

and then we can move k1 to the other side. So then we end up with this equation here,

namely C over K1 to the e is equal to K2 to the e. You notice if I multiply both

sides by K1 to the e, I get that C is equal to K1 times K2 to the e,

which is precisely this equation here. Okay, so all I did is I just replaced K by

K1 times K2 and then divided by K1 to the e. So by now this should look familiar to

you. So now we have two variables in this equation, of course. C is known to the

attacker, E is known to the attacker, and N is known to the attacker. The two

variables in this equation are K1 and K2, and we've separated them into a

different side of the equation, and as a result we can do now a meet in the middle

attack on this equation. So let's do the meet in the middle attack. What we would

do is we would build a table of all possible values Of the left-hand side. So

all possible values of C over K1 to the E, there are 2 to the 34 of them. And then,

for all possible values on the right-hand side, [inaudible] for all possible values

of K2 to the e, we're gonna check whether this value here lives in the table that we

constructed in step one. And if it is then we found a collision, and basically we

have a solution to this equation. So as soon as we find an element of the form K2

to the E that lives in the table that we built in step one, we've solved this

equation and we found K1 and K2. And of course once we found K1 and K2,

we can easily recover K simply by multiplying them. So then we multiply K1

and K2 and we get, the secret key K. Okay? So, we've broken, basically,

this, this encryption system. And how long did it take? Well, by brute force, we

could have broken it in time 2 to the 64, simply by trying all possible keys. But

this attack, you notice, it took 2 to the 34 time for step number one. Well, it

took a little bit more than 2 to the 34, 'cause we had to do these exponentiations.

It took 2 to the 34 time for step two against slightly more than two to the 34

because of the exponentiations. So let's say that the whole algorithm took time two

to the 40. The point is that this is much, much less time due to the 64. So here you

have an example. Where if you encrypt using RSA directly, in other words you

directly compute, K to the E, mod N, instead of going through the ISO standard,

for example, then you get an attack that runs much faster than exhaustive search.

You get an attack that runs in time two to the 40, rather than time two to the 64.

Okay, so this is a cute example of how things can break if you use the RSA

trapdoor permutation directly to encrypt a message. So the message to

remember here, is never, ever, ever use RSA directly to encrypt. You have to use to go

through an encryption mechanism. For example, like the ISO standard. And in

fact, in the next segment we are going to see other ways of using RSA to build

public key encryption.