1 00:00:00,000 --> 00:00:01,497 2 00:00:01,497 --> 00:00:04,990 [MUSIC PLAYING] 3 00:00:04,990 --> 00:00:54,400 4 00:00:54,400 --> 00:00:58,540 So actually, before I start the topic of my talk I want to-- 5 00:00:58,540 --> 00:01:01,810 I'd like to acknowledge two MIT graduates who 6 00:01:01,810 --> 00:01:03,310 are co-authors on some of the work 7 00:01:03,310 --> 00:01:06,640 that Mike Dertouzos mentioned yesterday in the banquet. 8 00:01:06,640 --> 00:01:09,400 One of them is a great PhD student, Joe Kilian-- 9 00:01:09,400 --> 00:01:10,510 still in the lab. 10 00:01:10,510 --> 00:01:12,760 And the other one is Charles Rackoff-- 11 00:01:12,760 --> 00:01:15,700 was joint on the work on zero-knowledge proofs 12 00:01:15,700 --> 00:01:18,160 with Silvio Micali and myself. 13 00:01:18,160 --> 00:01:22,690 So now let's go on to the search for provably 14 00:01:22,690 --> 00:01:25,930 secure cryptosystems. 15 00:01:25,930 --> 00:01:30,070 So cryptography in its simplest form 16 00:01:30,070 --> 00:01:35,020 is simply the problem of exchanging secure messages 17 00:01:35,020 --> 00:01:37,360 over an insecure channel. 18 00:01:37,360 --> 00:01:40,330 Alice wants to send Bob a message so 19 00:01:40,330 --> 00:01:43,930 that the listener on the line has no way to figure out 20 00:01:43,930 --> 00:01:45,820 what the message is. 21 00:01:45,820 --> 00:01:50,170 And traditionally this was achieved by Alice and Bob 22 00:01:50,170 --> 00:01:54,280 meeting a priori and agreeing on some secret code 23 00:01:54,280 --> 00:01:58,090 to use in order to communicate, hoping that the listener will 24 00:01:58,090 --> 00:02:02,540 have no way to figure out what the message is. 25 00:02:02,540 --> 00:02:05,080 And they also traditionally-- it has been somewhat 26 00:02:05,080 --> 00:02:08,169 of an equal fight between the code makers, Alice 27 00:02:08,169 --> 00:02:12,820 and Bob, and the code breaker, who is the listener. 28 00:02:12,820 --> 00:02:15,790 In the sense that all codes were broken 29 00:02:15,790 --> 00:02:20,360 and it was just a matter of time when that will happen. 30 00:02:20,360 --> 00:02:24,040 In fact, when Alice and Bob made up their code 31 00:02:24,040 --> 00:02:27,260 they basically tried to break it themselves. 32 00:02:27,260 --> 00:02:28,840 And when they failed, they decided 33 00:02:28,840 --> 00:02:30,230 that it was a pretty good code. 34 00:02:30,230 --> 00:02:31,550 They're going to use it. 35 00:02:31,550 --> 00:02:35,440 So they had no guarantee that the code that they are using 36 00:02:35,440 --> 00:02:38,632 will actually be secure. 37 00:02:38,632 --> 00:02:40,090 Well, in the last 10 years, there's 38 00:02:40,090 --> 00:02:43,630 been a tremendous progress on the theoretical front 39 00:02:43,630 --> 00:02:45,910 toward actually designing codes which 40 00:02:45,910 --> 00:02:48,430 are provably hard to break. 41 00:02:48,430 --> 00:02:51,970 Where you, Alice and Bob were given the guarantee that no 42 00:02:51,970 --> 00:02:56,230 matter what the listener does he cannot figure out what is 43 00:02:56,230 --> 00:02:58,390 the message sent across. 44 00:02:58,390 --> 00:03:01,210 And this progress has gone really hand-in-hand 45 00:03:01,210 --> 00:03:05,770 with the increasing use of, or need for, 46 00:03:05,770 --> 00:03:07,710 cryptography in the commercial sector. 47 00:03:07,710 --> 00:03:12,520 48 00:03:12,520 --> 00:03:15,303 And what I'll do in this talk basically are two things. 49 00:03:15,303 --> 00:03:16,720 First of all, I'll try to give you 50 00:03:16,720 --> 00:03:20,170 an idea of what is in the heart of these modern cryptographic 51 00:03:20,170 --> 00:03:21,580 systems. 52 00:03:21,580 --> 00:03:23,980 And in the second part, I'll sort of 53 00:03:23,980 --> 00:03:26,320 turn to a more futuristic setting 54 00:03:26,320 --> 00:03:28,960 where we have a distributed network of computers. 55 00:03:28,960 --> 00:03:31,870 And discuss what is the role of cryptography, 56 00:03:31,870 --> 00:03:34,960 in this future setting, with many computers hooked up 57 00:03:34,960 --> 00:03:36,350 together. 58 00:03:36,350 --> 00:03:39,130 But let's begin in the beginning. 59 00:03:39,130 --> 00:03:44,650 And the beginning really starts with a very important paper 60 00:03:44,650 --> 00:03:52,650 by Diffie Hellman in '76, where they put forth the goal. 61 00:03:52,650 --> 00:03:55,060 OK, so they put forth the idea that what 62 00:03:55,060 --> 00:03:57,940 we should be looking for are codes, 63 00:03:57,940 --> 00:04:01,210 such that we can prove that breaking the code 64 00:04:01,210 --> 00:04:05,290 is a hard, impossible, computational problem. 65 00:04:05,290 --> 00:04:07,810 And by impossible, I mean that Alice and Bob 66 00:04:07,810 --> 00:04:10,300 will have the guarantee that as they're communicating 67 00:04:10,300 --> 00:04:14,110 across the line there was just no time, simply not 68 00:04:14,110 --> 00:04:19,089 enough time in the universe for the listener to crack the code. 69 00:04:19,089 --> 00:04:20,920 That's an incredibly ambitious goal 70 00:04:20,920 --> 00:04:23,020 if you went to the talk of Mike Sipser. 71 00:04:23,020 --> 00:04:25,150 Because there are no natural examples, 72 00:04:25,150 --> 00:04:27,040 that we know of, for which we can actually 73 00:04:27,040 --> 00:04:31,900 prove to be that difficult. But Diffie and Hellman do 74 00:04:31,900 --> 00:04:33,740 more than just propose this goal, 75 00:04:33,740 --> 00:04:35,830 they actually give us an idea of how 76 00:04:35,830 --> 00:04:38,590 to proceed toward achieving it. 77 00:04:38,590 --> 00:04:41,260 And this idea is best illustrated 78 00:04:41,260 --> 00:04:46,090 by its first implementation, which 79 00:04:46,090 --> 00:04:50,830 was proposed in this laboratory by Rivest, Shamir, and Adleman, 80 00:04:50,830 --> 00:04:56,410 and is known as the RSA scheme. 81 00:04:56,410 --> 00:05:00,940 By the way, I think that Diffie is also a graduate of MIT. 82 00:05:00,940 --> 00:05:02,470 But the work was done at Stanford. 83 00:05:02,470 --> 00:05:07,960 84 00:05:07,960 --> 00:05:10,360 So this is the idea of Rivest, Shamir, and Adleman, 85 00:05:10,360 --> 00:05:12,940 abbreviated by RSA. 86 00:05:12,940 --> 00:05:15,730 Consider the problem of factoring integers. 87 00:05:15,730 --> 00:05:18,910 So you're given a composite number, n, a large number, much 88 00:05:18,910 --> 00:05:20,710 larger than this, and you're trying 89 00:05:20,710 --> 00:05:23,050 to find out its prime factors. 90 00:05:23,050 --> 00:05:24,625 That is a hard problem. 91 00:05:24,625 --> 00:05:27,250 Again, in spite of what you may have read in The New York Times 92 00:05:27,250 --> 00:05:28,450 recently. 93 00:05:28,450 --> 00:05:34,820 And what RSA says is the following, 94 00:05:34,820 --> 00:05:36,530 take this function f of x. 95 00:05:36,530 --> 00:05:38,180 What does this function do to x? 96 00:05:38,180 --> 00:05:42,260 It takes x to the power alpha mod n. 97 00:05:42,260 --> 00:05:44,330 So regardless of what this really means 98 00:05:44,330 --> 00:05:46,560 it has the following three properties. 99 00:05:46,560 --> 00:05:51,980 First of all, given x, it's very easy to compute f of x. 100 00:05:51,980 --> 00:05:55,670 Second of all, given f of x, it is extremely hard 101 00:05:55,670 --> 00:05:59,930 computationally to find out what x is. 102 00:05:59,930 --> 00:06:04,250 And third, if you happen to know some extra information, 103 00:06:04,250 --> 00:06:07,010 if you happen to know the factors of n, 104 00:06:07,010 --> 00:06:10,190 then it is extremely easy to figure out what x is, 105 00:06:10,190 --> 00:06:12,620 starting from f of x. 106 00:06:12,620 --> 00:06:15,020 These three properties is what gives this function 107 00:06:15,020 --> 00:06:18,740 its name, which is a one-way trapdoor function. 108 00:06:18,740 --> 00:06:21,740 One-way, because it's easy in one way, 109 00:06:21,740 --> 00:06:23,390 but hard the other way. 110 00:06:23,390 --> 00:06:25,880 And trapdoor, because if you happen 111 00:06:25,880 --> 00:06:28,520 to know some extra trapdoor information, 112 00:06:28,520 --> 00:06:32,810 the function is actually easy to invert. 113 00:06:32,810 --> 00:06:35,850 Now, what does this have to do with secure communication? 114 00:06:35,850 --> 00:06:39,530 Well, it has everything to do with it. 115 00:06:39,530 --> 00:06:42,260 In particular, what Alice and Bob can now do 116 00:06:42,260 --> 00:06:43,437 is the following. 117 00:06:43,437 --> 00:06:47,550 118 00:06:47,550 --> 00:06:51,440 If Alice now wants to send Bob a secure message, all he does 119 00:06:51,440 --> 00:06:52,410 is the following. 120 00:06:52,410 --> 00:06:55,310 He takes f and computes f of the message 121 00:06:55,310 --> 00:06:58,280 and sends it to Bob across the line. 122 00:06:58,280 --> 00:07:02,180 She can do that because that's an easy transformation. 123 00:07:02,180 --> 00:07:04,160 Now, the listener, who's trying to figure out 124 00:07:04,160 --> 00:07:07,250 what the message is, would like to invert f. 125 00:07:07,250 --> 00:07:11,180 But he cannot do that since that's a hard transformation. 126 00:07:11,180 --> 00:07:13,910 And finally, Bob, the legal recipient, 127 00:07:13,910 --> 00:07:16,640 should be able to figure out what the message is. 128 00:07:16,640 --> 00:07:21,320 So we will give him the extra trapdoor information, 129 00:07:21,320 --> 00:07:23,090 so he will be able to invert the function 130 00:07:23,090 --> 00:07:25,880 and figure out what the message is. 131 00:07:25,880 --> 00:07:28,850 So this is really an incredibly beautiful idea 132 00:07:28,850 --> 00:07:31,280 which manages to break the symmetry between the code 133 00:07:31,280 --> 00:07:35,330 makers, Alice and Bob, and the listener. 134 00:07:35,330 --> 00:07:37,820 Alice and Bob can perform an easy computation 135 00:07:37,820 --> 00:07:39,890 and communicate securely. 136 00:07:39,890 --> 00:07:43,220 The listener is confronted with an impossible computation 137 00:07:43,220 --> 00:07:45,260 and is not able to break the system. 138 00:07:45,260 --> 00:07:48,410 139 00:07:48,410 --> 00:07:51,360 Now, let's recall our goal. 140 00:07:51,360 --> 00:07:54,590 Our goal was to come up with codes where we can actually 141 00:07:54,590 --> 00:07:57,560 prove mathematically that breaking the code 142 00:07:57,560 --> 00:08:00,950 is as hard as solving a computationally 143 00:08:00,950 --> 00:08:05,520 infeasible problem, let's say, like factoring integers. 144 00:08:05,520 --> 00:08:07,580 So if we sit down to actually prove this 145 00:08:07,580 --> 00:08:11,552 we have to sort of sit back and think. 146 00:08:11,552 --> 00:08:13,010 All right, so we want to prove that 147 00:08:13,010 --> 00:08:15,680 what does it really mean that it's 148 00:08:15,680 --> 00:08:18,050 impossible to break the code? 149 00:08:18,050 --> 00:08:21,200 OK, we have not defined yet precisely what 150 00:08:21,200 --> 00:08:26,090 it means for the listener to break the code. 151 00:08:26,090 --> 00:08:31,040 And here's one possible meaning that you can give. 152 00:08:31,040 --> 00:08:33,610 153 00:08:33,610 --> 00:08:36,340 In fact, it's extremely strong, meaning 154 00:08:36,340 --> 00:08:38,770 suppose we could really do what we wanted. 155 00:08:38,770 --> 00:08:41,470 What we would want, would be for an encryption algorithm 156 00:08:41,470 --> 00:08:44,620 to be the same as taking x and putting it inside 157 00:08:44,620 --> 00:08:48,500 of an opaque envelope, and sending it across the line. 158 00:08:48,500 --> 00:08:50,740 So that the listener now, looking at the line, 159 00:08:50,740 --> 00:08:55,240 just sees an envelope and has no idea of what the contents are. 160 00:08:55,240 --> 00:08:57,250 So if this is really what we're shooting for, 161 00:08:57,250 --> 00:08:59,980 is it what we are getting? 162 00:08:59,980 --> 00:09:01,690 The answer is not quite. 163 00:09:01,690 --> 00:09:04,120 Because if you look at a function applied to an x 164 00:09:04,120 --> 00:09:08,320 It's an algebraic function or a function. 165 00:09:08,320 --> 00:09:10,900 Then it is not quite as good as putting 166 00:09:10,900 --> 00:09:12,430 x inside of an envelope. 167 00:09:12,430 --> 00:09:15,025 For example, suppose x was just 0, 168 00:09:15,025 --> 00:09:19,390 or 1 It turns out that by looking at f of 0, 169 00:09:19,390 --> 00:09:21,790 you can tell that this is f of 0. 170 00:09:21,790 --> 00:09:26,890 Or, by looking at f of 1, you can tell that this is f of 1. 171 00:09:26,890 --> 00:09:31,030 Another example is that f of x doesn't hide all information 172 00:09:31,030 --> 00:09:34,840 or partial information about x that you may want to hide. 173 00:09:34,840 --> 00:09:37,090 The same way that you do when you 174 00:09:37,090 --> 00:09:39,610 put x inside of an envelope. 175 00:09:39,610 --> 00:09:43,110 A third possibility is that you would want to make sure that 176 00:09:43,110 --> 00:09:45,610 the listener cannot detect that you've sent the same message 177 00:09:45,610 --> 00:09:46,300 twice. 178 00:09:46,300 --> 00:09:48,550 Or the relations between messages 179 00:09:48,550 --> 00:09:49,810 that you have been sending. 180 00:09:49,810 --> 00:09:51,580 That will be captured by envelopes, 181 00:09:51,580 --> 00:09:53,800 but it will not be captured by f of x. 182 00:09:53,800 --> 00:09:58,890 Because f of x, sent twice, is always the same. 183 00:09:58,890 --> 00:10:01,390 So you may be thinking, well, I'm setting my goals too high. 184 00:10:01,390 --> 00:10:03,580 We cannot achieve envelopes. 185 00:10:03,580 --> 00:10:05,110 But the answer is actually that you 186 00:10:05,110 --> 00:10:08,950 can achieve envelopes and let me give you an idea how. 187 00:10:08,950 --> 00:10:12,400 But again, before we show how we have to define mathematically, 188 00:10:12,400 --> 00:10:14,050 what do we mean by an envelope? 189 00:10:14,050 --> 00:10:18,380 We can't just say that we mean a physical envelope. 190 00:10:18,380 --> 00:10:19,955 And this is the definition. 191 00:10:19,955 --> 00:10:26,140 192 00:10:26,140 --> 00:10:28,260 Our definition is as follows. 193 00:10:28,260 --> 00:10:32,920 We'll say that a cryptosystem is secure if, for all message 194 00:10:32,920 --> 00:10:36,460 pairs m1 and 2, the blue message and the red message, 195 00:10:36,460 --> 00:10:40,540 which you may think of for convenient is 0 and 1, 196 00:10:40,540 --> 00:10:44,380 the listener cannot distinguish between an encryption of m1, 197 00:10:44,380 --> 00:10:48,040 an encryption of m2, better than 50-50. 198 00:10:48,040 --> 00:10:51,370 In other words, take m1 and encrypt it. 199 00:10:51,370 --> 00:10:53,590 Take m2 and encrypt it. 200 00:10:53,590 --> 00:10:56,200 Put it in those two encryptions, in front of the listener, 201 00:10:56,200 --> 00:10:59,230 and ask him to guess which is which. 202 00:10:59,230 --> 00:11:01,060 What we want is that for the listener 203 00:11:01,060 --> 00:11:03,770 to have no better chance than just flip a coin and say, 204 00:11:03,770 --> 00:11:05,650 well, I think this must be m1. 205 00:11:05,650 --> 00:11:07,540 Or, I think this must be m2. 206 00:11:07,540 --> 00:11:09,410 Notice that that is exactly what happens. 207 00:11:09,410 --> 00:11:12,250 If you put m1 in an envelope and m2 in an envelope, 208 00:11:12,250 --> 00:11:15,610 scramble them up, and put them in front of the listener. 209 00:11:15,610 --> 00:11:18,400 He will have no idea where m1 is. 210 00:11:18,400 --> 00:11:22,840 The best he can do is guess at random. 211 00:11:22,840 --> 00:11:24,880 And in particular, this definition, 212 00:11:24,880 --> 00:11:28,060 if you can find a cryptosystem satisfying it, 213 00:11:28,060 --> 00:11:31,310 will solve all the problems of the previous slide. 214 00:11:31,310 --> 00:11:34,300 So no partial information will leak. 215 00:11:34,300 --> 00:11:37,810 You cannot detect the two messages are sent, 216 00:11:37,810 --> 00:11:42,130 the same messages sent twice and you can send all possible 217 00:11:42,130 --> 00:11:45,620 messages this way. 218 00:11:45,620 --> 00:11:48,450 Note also, that I've changed the notation on this slide. 219 00:11:48,450 --> 00:11:50,330 So here I'm using "E", for envelope, 220 00:11:50,330 --> 00:11:54,290 or for encryption, while before we're using a function, "f". 221 00:11:54,290 --> 00:11:56,120 And the reason is that there really 222 00:11:56,120 --> 00:12:00,170 is no way that we can achieve this sort of security 223 00:12:00,170 --> 00:12:04,790 using one single function, one deterministic function. 224 00:12:04,790 --> 00:12:07,070 The simplest way to see that is that, if we 225 00:12:07,070 --> 00:12:09,320 send the same message twice, using f, 226 00:12:09,320 --> 00:12:12,050 you will tell that the same message is sent twice. 227 00:12:12,050 --> 00:12:14,510 And that is something that I want to outrule 228 00:12:14,510 --> 00:12:16,350 with this definition. 229 00:12:16,350 --> 00:12:19,817 So this gives you an idea of where the solution and what 230 00:12:19,817 --> 00:12:20,900 the solution will be like. 231 00:12:20,900 --> 00:12:24,600 232 00:12:24,600 --> 00:12:29,040 The solution is probabilistic encryption by Micali 233 00:12:29,040 --> 00:12:30,750 and myself in '82. 234 00:12:30,750 --> 00:12:33,300 And if the solution has the following property, 235 00:12:33,300 --> 00:12:37,830 that every message is going to have many possible encryptions. 236 00:12:37,830 --> 00:12:39,330 So it's not going to be true anymore 237 00:12:39,330 --> 00:12:41,220 that when we send the same message twice, 238 00:12:41,220 --> 00:12:44,790 we necessarily are sending the same encryption. 239 00:12:44,790 --> 00:12:47,520 So every m will have many possible encryptions 240 00:12:47,520 --> 00:12:50,920 associated with it. 241 00:12:50,920 --> 00:12:53,850 But we will still keep the trapdoor function flavor 242 00:12:53,850 --> 00:12:57,330 so that it's going to be still hard for the listener, 243 00:12:57,330 --> 00:12:59,070 starting with one of these encryptions, 244 00:12:59,070 --> 00:13:01,330 to figure out the original message. 245 00:13:01,330 --> 00:13:04,140 And finally, if you knew some extra information, 246 00:13:04,140 --> 00:13:06,540 if you did know the trap door, then you 247 00:13:06,540 --> 00:13:09,330 can figure out what the message is. 248 00:13:09,330 --> 00:13:13,080 So suppose we could construct such a primitive. 249 00:13:13,080 --> 00:13:16,560 Then all Alice has to do is to pick one of these encodings 250 00:13:16,560 --> 00:13:18,750 at random-- 251 00:13:18,750 --> 00:13:21,720 and at random here is very important-- 252 00:13:21,720 --> 00:13:24,450 and to send this encoding across to Bob. 253 00:13:24,450 --> 00:13:28,320 Bob figures out what the message is since he has the trap door, 254 00:13:28,320 --> 00:13:32,340 and the listener is stuck. 255 00:13:32,340 --> 00:13:35,340 And notice here that in fact, if we send the same message twice 256 00:13:35,340 --> 00:13:37,680 here, then what's the probability that Alice will 257 00:13:37,680 --> 00:13:39,900 pick the same encoding twice? 258 00:13:39,900 --> 00:13:42,300 Well, if there's lots of encodings to choose from that 259 00:13:42,300 --> 00:13:43,210 will be very small. 260 00:13:43,210 --> 00:13:45,850 261 00:13:45,850 --> 00:13:49,590 So it's a fine goal and it is achievable. 262 00:13:49,590 --> 00:13:53,770 And now, I will get to the most technical slide in my talk, 263 00:13:53,770 --> 00:13:59,290 actually trying to give you an idea of how it is achievable. 264 00:13:59,290 --> 00:14:00,730 So bear with me. 265 00:14:00,730 --> 00:14:03,160 It will get better. 266 00:14:03,160 --> 00:14:05,580 I hope. 267 00:14:05,580 --> 00:14:09,970 All right, so forget sending arbitrary messages. 268 00:14:09,970 --> 00:14:12,690 Suppose that all we want to send is either a 0 or a 1, 269 00:14:12,690 --> 00:14:14,500 how do we do that? 270 00:14:14,500 --> 00:14:19,650 Well, I pointed out earlier that if you apply f to x, 271 00:14:19,650 --> 00:14:22,860 and f is this trapdoor function, then f doesn't necessarily 272 00:14:22,860 --> 00:14:26,220 hide all information about x. 273 00:14:26,220 --> 00:14:28,710 But what you should observe is that f must 274 00:14:28,710 --> 00:14:30,900 hide some information about x. 275 00:14:30,900 --> 00:14:32,670 So if f really is trap door, there 276 00:14:32,670 --> 00:14:35,790 must be something that it's hiding about x. 277 00:14:35,790 --> 00:14:39,630 And in particular, there was a single bit of information 278 00:14:39,630 --> 00:14:42,180 about x that f is hiding. 279 00:14:42,180 --> 00:14:44,520 So for example, to make this more concrete 280 00:14:44,520 --> 00:14:47,130 for the case of the RSA function, 281 00:14:47,130 --> 00:14:49,530 if you just look at f of x, one can 282 00:14:49,530 --> 00:14:51,390 prove that there's no way that you 283 00:14:51,390 --> 00:14:55,500 can tell whether x is an even number, or x is an odd number. 284 00:14:55,500 --> 00:15:00,060 So f of x hides very well, whether x is even, or x is odd. 285 00:15:00,060 --> 00:15:03,110 286 00:15:03,110 --> 00:15:05,720 In the risk of being redundant, basically, what we 287 00:15:05,720 --> 00:15:07,370 realize is the following picture. 288 00:15:07,370 --> 00:15:13,010 That the listener, when he looks at the purple f of x's-- 289 00:15:13,010 --> 00:15:15,770 which are x's applied to even numbers-- 290 00:15:15,770 --> 00:15:18,320 and we looked at the green f of x's-- 291 00:15:18,320 --> 00:15:20,450 x is applied to odd numbers-- 292 00:15:20,450 --> 00:15:23,150 has no way to tell these two spaces apart. 293 00:15:23,150 --> 00:15:26,360 To him they look identical. 294 00:15:26,360 --> 00:15:29,390 And once that is real-- and once we realize that, 295 00:15:29,390 --> 00:15:31,400 our encryption scheme follows. 296 00:15:31,400 --> 00:15:34,010 Now, in order to send an encryption of a zero, 297 00:15:34,010 --> 00:15:40,430 all we do is pick a purple f of x, and send it to Bob. 298 00:15:40,430 --> 00:15:42,650 And if we wanted to send a one, all we 299 00:15:42,650 --> 00:15:46,610 do is pick a green f of x, and send it to Bob. 300 00:15:46,610 --> 00:15:49,350 Now Bob is able to invert this function 301 00:15:49,350 --> 00:15:51,260 so he can tell whether our x is purple, 302 00:15:51,260 --> 00:15:52,910 or whether our x is green. 303 00:15:52,910 --> 00:15:56,270 Namely whether our x is even, or whether our x is odd. 304 00:15:56,270 --> 00:15:59,990 And the listener has no way to distinguish between those even 305 00:15:59,990 --> 00:16:03,095 purple x's and the green odd x's. 306 00:16:03,095 --> 00:16:06,390 307 00:16:06,390 --> 00:16:11,410 Our slide is completed and you must be thinking two things. 308 00:16:11,410 --> 00:16:14,880 First of all, you want to send more than just zeros and ones. 309 00:16:14,880 --> 00:16:17,850 And to that my answer is, OK, write your message 310 00:16:17,850 --> 00:16:22,412 as a sequence of zeros and ones and send each bit individually. 311 00:16:22,412 --> 00:16:24,120 And the second thing you must be thinking 312 00:16:24,120 --> 00:16:27,450 is, wow, this might be nice theoretically, 313 00:16:27,450 --> 00:16:29,740 but it's extremely expensive. 314 00:16:29,740 --> 00:16:32,520 So for every bit you've got to do a tremendous amount of work 315 00:16:32,520 --> 00:16:36,370 and send a tremendous amount of information across the line. 316 00:16:36,370 --> 00:16:39,000 So, you know, don't be of little faith. 317 00:16:39,000 --> 00:16:42,360 Basically, with more theory and more ideas 318 00:16:42,360 --> 00:16:46,200 you can actually get the same security level proposed 319 00:16:46,200 --> 00:16:49,020 by this scheme, and do it as efficiently 320 00:16:49,020 --> 00:16:53,640 as the original trapdoor function schemes-- 321 00:16:53,640 --> 00:16:56,458 almost as efficiently. 322 00:16:56,458 --> 00:16:57,000 Pretty close. 323 00:16:57,000 --> 00:17:01,410 324 00:17:01,410 --> 00:17:06,790 OK, so just to summarize this, if we put all this together, 325 00:17:06,790 --> 00:17:09,810 Diffie and Hellman, and RSA, and probablistic encryption, 326 00:17:09,810 --> 00:17:13,470 and ideas by Blum, Blum, and Shub, what we can show now 327 00:17:13,470 --> 00:17:17,550 is a cryptosystem, which is secure, if and only 328 00:17:17,550 --> 00:17:22,079 if, factoring in integers is a hard problem. 329 00:17:22,079 --> 00:17:25,829 That is, we've managed to achieve our original goal that 330 00:17:25,829 --> 00:17:28,050 making a code, where breaking the code, 331 00:17:28,050 --> 00:17:32,070 is as hard as solving a hard computational problem. 332 00:17:32,070 --> 00:17:35,230 Notice that this theorem is basically a double edged sword. 333 00:17:35,230 --> 00:17:38,880 Either you've realized an incredibly secure cryptosystem. 334 00:17:38,880 --> 00:17:43,050 Or, the crypto analyst has managed to factor integers, 335 00:17:43,050 --> 00:17:45,360 has managed to solve a problem, which has been defying 336 00:17:45,360 --> 00:17:47,650 mathematicians for centuries. 337 00:17:47,650 --> 00:17:50,760 And that's a risk you may want to take. 338 00:17:50,760 --> 00:17:54,570 339 00:17:54,570 --> 00:17:57,270 OK, so this wraps up the story of encryption, 340 00:17:57,270 --> 00:17:59,520 but it does not wrap up my talk. 341 00:17:59,520 --> 00:18:01,710 And it actually doesn't wrap up the story 342 00:18:01,710 --> 00:18:02,850 of modern cryptography. 343 00:18:02,850 --> 00:18:05,790 There's a lot more that we can do 344 00:18:05,790 --> 00:18:14,220 and I will only mention one other result in cryptography. 345 00:18:14,220 --> 00:18:17,130 Basically, we're going to go from the most basic problem 346 00:18:17,130 --> 00:18:21,180 to the current issues that people are working on. 347 00:18:21,180 --> 00:18:24,690 348 00:18:24,690 --> 00:18:26,190 So we're going to leave behind Alice 349 00:18:26,190 --> 00:18:28,320 and Bob and their communication problems. 350 00:18:28,320 --> 00:18:31,890 And we're going to go into a distributed environment where 351 00:18:31,890 --> 00:18:34,020 we don't only have Alice and Bob. 352 00:18:34,020 --> 00:18:38,070 But we have a whole bunch of processors hooked up together, 353 00:18:38,070 --> 00:18:40,710 in some arbitrary fashion. 354 00:18:40,710 --> 00:18:46,650 And each has some, let's say, some local state 355 00:18:46,650 --> 00:18:49,170 which I'll just denote by this processors 356 00:18:49,170 --> 00:18:53,940 has x1, x2, x3, x4, xn. 357 00:18:53,940 --> 00:18:56,700 And what they'd like to do is not only 358 00:18:56,700 --> 00:18:59,200 communicate in pairs in privacy. 359 00:18:59,200 --> 00:19:03,450 But what they'd like to do is to compute some global computation 360 00:19:03,450 --> 00:19:06,670 with respect to these private inputs. 361 00:19:06,670 --> 00:19:09,510 So generally they want to compute a function, "g", 362 00:19:09,510 --> 00:19:11,490 defined on their local states. 363 00:19:11,490 --> 00:19:13,460 This computation should be correct. 364 00:19:13,460 --> 00:19:15,660 OK, so they want to be guaranteed that in the end 365 00:19:15,660 --> 00:19:19,410 they have the right result. And it should be private. 366 00:19:19,410 --> 00:19:21,120 What do we mean by private? 367 00:19:21,120 --> 00:19:23,490 That in the end, after the computation is completed 368 00:19:23,490 --> 00:19:24,990 and they have received the outcome-- 369 00:19:24,990 --> 00:19:26,880 receive the outcome of the function-- 370 00:19:26,880 --> 00:19:29,850 we want to give them the guarantee that other processors 371 00:19:29,850 --> 00:19:32,520 don't find out more than the function gives them 372 00:19:32,520 --> 00:19:35,830 about the local state. 373 00:19:35,830 --> 00:19:45,200 And to make life even worse, some of these processors 374 00:19:45,200 --> 00:19:47,760 maybe faulty. 375 00:19:47,760 --> 00:19:50,210 So they could either be benign faults, like-- 376 00:19:50,210 --> 00:19:52,805 their eyes aren't aligned here-- 377 00:19:52,805 --> 00:19:54,680 OK, they can either be benign faults, 378 00:19:54,680 --> 00:19:55,940 that is they just crash. 379 00:19:55,940 --> 00:19:58,760 Or we could consider the worst type of faults, 380 00:19:58,760 --> 00:20:01,850 they be-- they may maliciously be cooperating 381 00:20:01,850 --> 00:20:04,610 to sabotage a computation, and to find out 382 00:20:04,610 --> 00:20:08,420 information about each other. 383 00:20:08,420 --> 00:20:14,000 What is an example of such a computational goal? 384 00:20:14,000 --> 00:20:17,842 I mean, are there examples or is this just a paranoiac setting? 385 00:20:17,842 --> 00:20:20,300 I mean, we are cryptographers, so we are somewhat paranoid. 386 00:20:20,300 --> 00:20:27,270 387 00:20:27,270 --> 00:20:28,660 So what are some examples? 388 00:20:28,660 --> 00:20:30,270 So one example is that this network 389 00:20:30,270 --> 00:20:32,610 would like to elect a president, let's say. 390 00:20:32,610 --> 00:20:34,680 So you think about each one of these processors 391 00:20:34,680 --> 00:20:36,360 as having a vote. 392 00:20:36,360 --> 00:20:39,778 And they would like to compute the tally. 393 00:20:39,778 --> 00:20:41,820 They want two properties out of this computation. 394 00:20:41,820 --> 00:20:44,640 First they want to be able to verify that a tally was 395 00:20:44,640 --> 00:20:46,680 computed correctly. 396 00:20:46,680 --> 00:20:48,900 And second, they want to make sure 397 00:20:48,900 --> 00:20:51,990 that their vote has kept private. 398 00:20:51,990 --> 00:20:53,760 And perhaps, even third, they would 399 00:20:53,760 --> 00:20:57,870 like to ensure that the vote of one processor 400 00:20:57,870 --> 00:21:01,423 was selected independently of the vote of another processor. 401 00:21:01,423 --> 00:21:03,840 This is one example where it's fairly clear that you would 402 00:21:03,840 --> 00:21:05,830 like to get the correct answer. 403 00:21:05,830 --> 00:21:10,020 And you want to keep your private state private. 404 00:21:10,020 --> 00:21:11,500 What's another example? 405 00:21:11,500 --> 00:21:16,020 So another example is, suppose that these processors are 406 00:21:16,020 --> 00:21:18,870 all databases containing lists of names. 407 00:21:18,870 --> 00:21:24,390 408 00:21:24,390 --> 00:21:27,320 These are the employees of the CIA, NSA, FBI, 409 00:21:27,320 --> 00:21:30,344 DARPA, and Project MAC. 410 00:21:30,344 --> 00:21:32,398 And they would like to find out who is employed 411 00:21:32,398 --> 00:21:33,440 by all of these agencies. 412 00:21:33,440 --> 00:21:37,850 413 00:21:37,850 --> 00:21:40,460 But none of them, except Project MAC, of course, 414 00:21:40,460 --> 00:21:42,040 would like to expose their employees. 415 00:21:42,040 --> 00:21:44,600 416 00:21:44,600 --> 00:21:45,520 So how can they do it? 417 00:21:45,520 --> 00:21:46,870 This is a global computation. 418 00:21:46,870 --> 00:21:49,210 They would like to find out who is in all the lists 419 00:21:49,210 --> 00:21:53,530 without telling who is in any of the list. 420 00:21:53,530 --> 00:21:55,870 Something like that. 421 00:21:55,870 --> 00:21:58,942 Good, so how can they achieve this? 422 00:21:58,942 --> 00:22:01,900 Or, if we wanted to be archaic, they could just send everything 423 00:22:01,900 --> 00:22:04,720 to a trusted center and he would compute that for them 424 00:22:04,720 --> 00:22:06,220 and give them the results. 425 00:22:06,220 --> 00:22:08,840 But that's really going back to the centralized world. 426 00:22:08,840 --> 00:22:10,840 And, not only that, as we said, we are paranoid. 427 00:22:10,840 --> 00:22:12,040 We do not trust the center. 428 00:22:12,040 --> 00:22:15,280 429 00:22:15,280 --> 00:22:17,490 So what we can do is actually achieve 430 00:22:17,490 --> 00:22:21,765 this in a distributed manner and the theorem is as follows. 431 00:22:21,765 --> 00:22:24,290 432 00:22:24,290 --> 00:22:28,170 If the processers can compute securely between themselves 433 00:22:28,170 --> 00:22:33,480 in pairs, and we're guaranteed that less than a third of them 434 00:22:33,480 --> 00:22:37,710 collaborate maliciously to sabotage a computation, 435 00:22:37,710 --> 00:22:40,260 then give me any function you like, 436 00:22:40,260 --> 00:22:42,660 let's say, like voting, or database intersection, 437 00:22:42,660 --> 00:22:44,820 or whatever you may be interested in. 438 00:22:44,820 --> 00:22:46,860 Just give me the input/output specification 439 00:22:46,860 --> 00:22:49,540 of the function and we can give you-- 440 00:22:49,540 --> 00:22:51,300 or it can be given-- 441 00:22:51,300 --> 00:22:56,100 a protocol for these impostors to follow 442 00:22:56,100 --> 00:23:00,210 so as to compute the function, both correctly and privately. 443 00:23:00,210 --> 00:23:02,350 I haven't defined what a protocol means. 444 00:23:02,350 --> 00:23:06,120 But you can keep in mind following intuitive definition. 445 00:23:06,120 --> 00:23:08,550 We will tell processor, "i", what to tell processor, 446 00:23:08,550 --> 00:23:13,080 "j", at time, "t", so that in the end of this protocol 447 00:23:13,080 --> 00:23:15,180 the function is actually being realized. 448 00:23:15,180 --> 00:23:18,180 And we have a guarantee that it's been realized correctly 449 00:23:18,180 --> 00:23:22,230 while keeping privacy intact. 450 00:23:22,230 --> 00:23:25,320 And that is really the end. 451 00:23:25,320 --> 00:23:28,310 452 00:23:28,310 --> 00:23:30,510 A natural question is, what's for the future? 453 00:23:30,510 --> 00:23:34,910 And the answer is that we've actually 454 00:23:34,910 --> 00:23:39,080 have an incredibly developed theory of cryptography now. 455 00:23:39,080 --> 00:23:44,150 And it's all developed, really based on, these very simple, 456 00:23:44,150 --> 00:23:47,150 nice assumptions that there exist trapdoor functions. 457 00:23:47,150 --> 00:23:49,448 And it is the most interesting goal right now 458 00:23:49,448 --> 00:23:51,740 is to actually figure out whether these assumptions are 459 00:23:51,740 --> 00:23:54,950 minimal assumptions or can you somehow shrink them 460 00:23:54,950 --> 00:23:59,960 down to the point where we can prove that they hold. 461 00:23:59,960 --> 00:24:02,060 Thank you. 462 00:24:02,060 --> 00:24:04,210 [MUSIC PLAYING] 463 00:24:04,210 --> 00:24:29,000