1
00:00:00,000 --> 00:00:52,050


2
00:00:52,050 --> 00:00:54,090
Good morning, ladies
and gentlemen.

3
00:00:54,090 --> 00:00:57,690
I'd like, today, to speak
about zero-knowledge.

4
00:00:57,690 --> 00:01:01,980
Or, I'd like to speak about
nothing but the truth.

5
00:01:01,980 --> 00:01:04,860


6
00:01:04,860 --> 00:01:08,660
So let me ask the question
that you may already

7
00:01:08,660 --> 00:01:14,830
wondering about, namely,
what is zero-knowledge?

8
00:01:14,830 --> 00:01:20,330
Well, zero-knowledge is
a new and powerful theory

9
00:01:20,330 --> 00:01:25,040
of both correctness and
privacy of information

10
00:01:25,040 --> 00:01:28,970
with big impact both from a
theoretical point of view,

11
00:01:28,970 --> 00:01:32,120
and a practical point of view.

12
00:01:32,120 --> 00:01:35,240
And this theory has been
developed quite quickly

13
00:01:35,240 --> 00:01:37,610
in recent years.

14
00:01:37,610 --> 00:01:40,685
And now has already
a very rich history.

15
00:01:40,685 --> 00:01:45,880


16
00:01:45,880 --> 00:01:53,830
These are, according to me,
really the papers that made it.

17
00:01:53,830 --> 00:01:59,200
And if you knew already that
MIT was a center of knowledge,

18
00:01:59,200 --> 00:02:01,510
you'd be happy to
know that MIT is also

19
00:02:01,510 --> 00:02:03,243
center of zero knowledge.

20
00:02:03,243 --> 00:02:08,449


21
00:02:08,449 --> 00:02:13,280
Not only the starred people
here all belong to our lab,

22
00:02:13,280 --> 00:02:16,240
but also most of the other
ones were associated to our lab

23
00:02:16,240 --> 00:02:17,240
at one point or another.

24
00:02:17,240 --> 00:02:19,970


25
00:02:19,970 --> 00:02:30,260
Well, we were saying correctness
and privacy of information.

26
00:02:30,260 --> 00:02:35,420
Now traditionally, checking the
correctness of some information

27
00:02:35,420 --> 00:02:38,030
is the domain of proofs.

28
00:02:38,030 --> 00:02:41,270
So the first thing that
we are going to ask today

29
00:02:41,270 --> 00:02:44,390
is, what is a proof?

30
00:02:44,390 --> 00:02:48,380
And once we have answered
this big question

31
00:02:48,380 --> 00:02:50,970
with a sufficient
level of generality,

32
00:02:50,970 --> 00:02:54,560
we will proceed to tackle
more ambitious questions.

33
00:02:54,560 --> 00:03:00,260
Such as, how much knowledge
should the proof contain?

34
00:03:00,260 --> 00:03:02,640
Let me start with
the first question.

35
00:03:02,640 --> 00:03:07,140


36
00:03:07,140 --> 00:03:10,320
What is a proof?

37
00:03:10,320 --> 00:03:14,220
No matter how you put it,
a pool is an interaction

38
00:03:14,220 --> 00:03:16,140
between two parties.

39
00:03:16,140 --> 00:03:22,950
One party better works hard,
and we'll call him the prover.

40
00:03:22,950 --> 00:03:26,340
And the other party
better be lazy.

41
00:03:26,340 --> 00:03:27,675
And we call him the verifier.

42
00:03:27,675 --> 00:03:30,990


43
00:03:30,990 --> 00:03:35,640
Should the time necessary
to find the proof of theorem

44
00:03:35,640 --> 00:03:39,420
roughly coincide with the
time necessary to check

45
00:03:39,420 --> 00:03:40,950
whether the proof is correct?

46
00:03:40,950 --> 00:03:43,440
Then all mathematicians
would be out of jobs.

47
00:03:43,440 --> 00:03:46,300


48
00:03:46,300 --> 00:03:47,820
So that is what a proof is.

49
00:03:47,820 --> 00:03:53,640
And let me give you a simple
example, which technically,

50
00:03:53,640 --> 00:03:56,220
within a framework
called NP on which

51
00:03:56,220 --> 00:03:59,380
Mike Sipser spoke about today.

52
00:03:59,380 --> 00:04:01,960
So here we have our input
statement the theorem

53
00:04:01,960 --> 00:04:04,140
we would like to prove.

54
00:04:04,140 --> 00:04:09,060
We have two graphs, A and B.
And they are the same graphs,

55
00:04:09,060 --> 00:04:12,780
except they are
drawn differently,

56
00:04:12,780 --> 00:04:15,750
and they changed the
names of their nodes.

57
00:04:15,750 --> 00:04:16,769
One has numbers.

58
00:04:16,769 --> 00:04:17,640
One has letters.

59
00:04:17,640 --> 00:04:20,329


60
00:04:20,329 --> 00:04:22,730
Mathematically, when
this is the case,

61
00:04:22,730 --> 00:04:25,910
you just say that A and
B are isomorphic graphs.

62
00:04:25,910 --> 00:04:30,080
The same graph but
drawn differently.

63
00:04:30,080 --> 00:04:34,835
Now, in this case,
it is immediate

64
00:04:34,835 --> 00:04:36,760
that these two
graphs are the same.

65
00:04:36,760 --> 00:04:40,870
But should this graph have
had thousands of nodes,

66
00:04:40,870 --> 00:04:43,240
we wouldn't be
able to distinguish

67
00:04:43,240 --> 00:04:45,200
or to detect this fact at all.

68
00:04:45,200 --> 00:04:49,660
Therefore, this is a
plausible hard theorem.

69
00:04:49,660 --> 00:04:52,360
But the prover can't convince
us that the two graphs are

70
00:04:52,360 --> 00:04:55,450
the same in this simple case,
like in a very complicated

71
00:04:55,450 --> 00:05:00,190
case, by just sending over
the correspondence, a 1 to 1

72
00:05:00,190 --> 00:05:02,740
correspondence between
the vertices of A

73
00:05:02,740 --> 00:05:06,640
and the vertices of B
that preserve the edges.

74
00:05:06,640 --> 00:05:12,880
That is to say, A over here,
really is vertex 5 over there.

75
00:05:12,880 --> 00:05:15,790
B is free, and so on.

76
00:05:15,790 --> 00:05:18,880
And the verifier
has a very easy job.

77
00:05:18,880 --> 00:05:24,970
Only to detect that each vertex,
each edge in the first graph,

78
00:05:24,970 --> 00:05:26,680
is also an edge in
the second graph.

79
00:05:26,680 --> 00:05:32,890
For instance, 1, 2 is an edge
in A, But 1 is d, and 2 is c.

80
00:05:32,890 --> 00:05:36,250
So vertice d be an edge
of it in the second graph.

81
00:05:36,250 --> 00:05:38,350
And indeed, there it is.

82
00:05:38,350 --> 00:05:41,800
And also should check with vise
versa, but every edge over here

83
00:05:41,800 --> 00:05:44,400
is an edge over there.

84
00:05:44,400 --> 00:05:47,740
And you can see this may be
important, because if these

85
00:05:47,740 --> 00:05:50,740
are the graphs of
two large molecules,

86
00:05:50,740 --> 00:05:52,900
and one has been discovered
in Europe and one

87
00:05:52,900 --> 00:05:54,430
in the United
States, and they want

88
00:05:54,430 --> 00:05:57,095
to know if the two
molecules are the same.

89
00:05:57,095 --> 00:05:59,470
Each researcher constructs
the molecule the way he wants,

90
00:05:59,470 --> 00:06:03,910
and cannot by eye
inspection find that out.

91
00:06:03,910 --> 00:06:07,760
But once that you look at this,
which is a specific example,

92
00:06:07,760 --> 00:06:12,130
but is a canonical way in
which proofs are given.

93
00:06:12,130 --> 00:06:14,410
There is a statement and
somebody writes the proof,

94
00:06:14,410 --> 00:06:17,620
publishes in a book, namely, in
our case, The Correspondence,

95
00:06:17,620 --> 00:06:20,470
and then the verifier
reads the paper.

96
00:06:20,470 --> 00:06:24,190
It will be immediate that
this way of proving things

97
00:06:24,190 --> 00:06:28,930
betrays much more knowledge
than we are asking for.

98
00:06:28,930 --> 00:06:33,280
Here we are asking about
the single bit question,

99
00:06:33,280 --> 00:06:36,160
are A and B the same graph?

100
00:06:36,160 --> 00:06:37,930
Yes or no?

101
00:06:37,930 --> 00:06:41,620
But the only way for a prover to
publish it on a piece of paper

102
00:06:41,620 --> 00:06:45,580
with proof, is to give away
the entire correspondence.

103
00:06:45,580 --> 00:06:47,990
Not only the single
bit we are asking for,

104
00:06:47,990 --> 00:06:50,050
but the entire thing.

105
00:06:50,050 --> 00:06:52,960
And that is intuitively
the case with this.

106
00:06:52,960 --> 00:06:57,340
Proofs betray much more
knowledge than necessary,

107
00:06:57,340 --> 00:06:59,770
perhaps.

108
00:06:59,770 --> 00:07:03,880
But this raises a
very good question.

109
00:07:03,880 --> 00:07:07,450
Namely, what is knowledge?

110
00:07:07,450 --> 00:07:10,990
And that raises even
a better question.

111
00:07:10,990 --> 00:07:12,325
Is there life--

112
00:07:12,325 --> 00:07:13,810
[LAUGHING]

113
00:07:13,810 --> 00:07:15,300


114
00:07:15,300 --> 00:07:20,130
So today, we want to
answer this question,

115
00:07:20,130 --> 00:07:23,370
but from a computer
science point of view.

116
00:07:23,370 --> 00:07:25,254
[LAUGHING]

117
00:07:25,254 --> 00:07:26,200


118
00:07:26,200 --> 00:07:29,820
So let me to refine our
intuition, [INAUDIBLE] all

119
00:07:29,820 --> 00:07:32,910
can be made [INAUDIBLE]
given more time.

120
00:07:32,910 --> 00:07:36,840
Let me tell you what knowledge
is by just giving what I'd

121
00:07:36,840 --> 00:07:39,150
like to call the Knowledge Act.

122
00:07:39,150 --> 00:07:45,550
The Knowledge Act is composed
if you want, by 2 axioms.

123
00:07:45,550 --> 00:07:50,760
First axiom says, efficient
computation is for free.

124
00:07:50,760 --> 00:07:52,180
What does that mean?

125
00:07:52,180 --> 00:07:54,180
Let me tell you in a day--

126
00:07:54,180 --> 00:07:55,560
everyday example.

127
00:07:55,560 --> 00:07:58,410
You are walking on the
street with an integer

128
00:07:58,410 --> 00:08:02,310
on your left hand and an
integer on your right hand.

129
00:08:02,310 --> 00:08:04,330
And all of a sudden you
meet some guy who says,

130
00:08:04,330 --> 00:08:07,110
hey, hey you with
a two integers.

131
00:08:07,110 --> 00:08:11,970
$1,000 and I will sell you the
product of these two integers.

132
00:08:11,970 --> 00:08:14,430
Then I don't know what you
are going to answer him.

133
00:08:14,430 --> 00:08:17,520
But what I will say
is you must be crazy,

134
00:08:17,520 --> 00:08:19,200
because multiplication is easy.

135
00:08:19,200 --> 00:08:22,650
I know how to multiply and
adjust and I'll keep the $1,000

136
00:08:22,650 --> 00:08:25,800
and get the product by myself.

137
00:08:25,800 --> 00:08:29,520
So we don't pay for
efficient computation.

138
00:08:29,520 --> 00:08:33,960
The second axiom, random
strings for free too.

139
00:08:33,960 --> 00:08:35,159
What does that mean?

140
00:08:35,159 --> 00:08:37,860
You usually, before you
are walking on the street

141
00:08:37,860 --> 00:08:40,049
and some other crazy-- the
same crazy guy stops you

142
00:08:40,049 --> 00:08:43,049
again and says,
hey $1,000 and I'm

143
00:08:43,049 --> 00:08:46,960
going to give you 100
random bit string.

144
00:08:46,960 --> 00:08:48,570
He says, look.

145
00:08:48,570 --> 00:08:49,470
You must be crazy.

146
00:08:49,470 --> 00:08:51,610
I have a coin the
same way that you do.

147
00:08:51,610 --> 00:08:53,490
And I can flip coins
as well as you do.

148
00:08:53,490 --> 00:08:58,980
So I flip 100 times my
coin and I save $1,000.

149
00:08:58,980 --> 00:09:04,170
So in such a case, what does
convey knowledge instead?

150
00:09:04,170 --> 00:09:06,780
Well, let's assume
now that you are again

151
00:09:06,780 --> 00:09:09,960
walking on the street with
a graph on your left hand

152
00:09:09,960 --> 00:09:12,180
and a graph on your right hand.

153
00:09:12,180 --> 00:09:14,700
And the same guys
asks you, George, you

154
00:09:14,700 --> 00:09:16,680
are having to isomorphic graphs.

155
00:09:16,680 --> 00:09:21,300
$1,000 and I sell you
the correspondence.

156
00:09:21,300 --> 00:09:23,070
Then you bargain for price.

157
00:09:23,070 --> 00:09:31,110


158
00:09:31,110 --> 00:09:36,930
Because essentially, that
information has knowledge.

159
00:09:36,930 --> 00:09:38,160
And what is knowledge?

160
00:09:38,160 --> 00:09:40,200
What has a chance
to convey knowledge

161
00:09:40,200 --> 00:09:43,710
is the output of an
feasible computation.

162
00:09:43,710 --> 00:09:48,120
A computation that would require
me too long, because the chance

163
00:09:48,120 --> 00:09:49,560
of conveying knowledge.

164
00:09:49,560 --> 00:09:51,810
But once you define
knowledge of a sway,

165
00:09:51,810 --> 00:09:54,120
you immediately see
the traditional proofs

166
00:09:54,120 --> 00:09:55,530
is an overkill.

167
00:09:55,530 --> 00:09:57,270
It's not only the
problem of the graph

168
00:09:57,270 --> 00:10:00,630
I saw my face on problem like we
saw before, is always the case.

169
00:10:00,630 --> 00:10:02,680
For the proofs that
you publish regularly,

170
00:10:02,680 --> 00:10:07,470
the proof-- the good old proofs,
they convey too much knowledge

171
00:10:07,470 --> 00:10:08,280
when they need.

172
00:10:08,280 --> 00:10:12,190
It's like shooting at
the fly with a gun.

173
00:10:12,190 --> 00:10:15,990
Now, given this is a problem,
a traditional proofs,

174
00:10:15,990 --> 00:10:17,730
well, let's see.

175
00:10:17,730 --> 00:10:21,840
These proofs are
a mental construct

176
00:10:21,840 --> 00:10:26,040
roughly 2,500 years ago, which
started in ancient Greece.

177
00:10:26,040 --> 00:10:29,070


178
00:10:29,070 --> 00:10:34,560
And we have been living
very happily in the house

179
00:10:34,560 --> 00:10:38,080
with Euclid and
others built for us,

180
00:10:38,080 --> 00:10:41,580
except that in recent years,
this house is becoming

181
00:10:41,580 --> 00:10:43,560
a little bit small for us.

182
00:10:43,560 --> 00:10:46,710
So when you tackle
questions about knowledge,

183
00:10:46,710 --> 00:10:49,350
we don't quite fit
in there anymore.

184
00:10:49,350 --> 00:10:54,000
And we want to
build a new house,

185
00:10:54,000 --> 00:10:57,040
retaining a wave of the
flavor of the old one.

186
00:10:57,040 --> 00:11:01,020
So we don't want any more to
deal with traditional proofs,

187
00:11:01,020 --> 00:11:04,230
but a different type
of object, which

188
00:11:04,230 --> 00:11:05,550
I'm going to present right now.

189
00:11:05,550 --> 00:11:07,170
And hopefully
you'll be convinced

190
00:11:07,170 --> 00:11:12,690
of it is enough for the same
thing but we can be happy.

191
00:11:12,690 --> 00:11:17,100
So I'd like
therefore, to give you

192
00:11:17,100 --> 00:11:22,950
a flavor of these new proofs
via an abridged example.

193
00:11:22,950 --> 00:11:26,820
And you should notice that
these new proofs differ

194
00:11:26,820 --> 00:11:29,160
from the classical
notion of a proof in two

195
00:11:29,160 --> 00:11:30,820
important respects.

196
00:11:30,820 --> 00:11:33,180
First, they are probabilistic.

197
00:11:33,180 --> 00:11:38,140
Prover and verifier flip coins
doing the proofing process.

198
00:11:38,140 --> 00:11:40,530
Second, they are instructed.

199
00:11:40,530 --> 00:11:43,320
They cannot be
published in a book,

200
00:11:43,320 --> 00:11:47,670
but the prover and the verifier
will talk back and forth,

201
00:11:47,670 --> 00:11:51,615
very much like in
classroom teaching.

202
00:11:51,615 --> 00:11:52,990
Let's have a look
at the example.

203
00:11:52,990 --> 00:11:55,520


204
00:11:55,520 --> 00:12:00,320
Here is the proof of
graph isomorphism.

205
00:12:00,320 --> 00:12:03,050
Input, we have two
graphs, A and B.

206
00:12:03,050 --> 00:12:05,990
And one way as we said to
prove A and B on the same graph

207
00:12:05,990 --> 00:12:10,430
is ship over to the
verifier, the correspondence.

208
00:12:10,430 --> 00:12:12,320
The prover and the
verifier instead

209
00:12:12,320 --> 00:12:17,780
will be engaging in a somewhat
more complicated game.

210
00:12:17,780 --> 00:12:23,060
They are going to repeat K times
the following instructions.

211
00:12:23,060 --> 00:12:30,210
First, the prover flips coins
to generate C, a fared graph,

212
00:12:30,210 --> 00:12:34,940
which is a scramble version
of A and B. So C is a graph

213
00:12:34,940 --> 00:12:37,850
and is isomorphic to
A and isomorphic to B,

214
00:12:37,850 --> 00:12:40,650
because A and B are isomorphic
between what each other.

215
00:12:40,650 --> 00:12:43,250
The only point is
really to prove it.

216
00:12:43,250 --> 00:12:46,160
So takes this random
graph isomorphic to both,

217
00:12:46,160 --> 00:12:47,120
and sends it over.

218
00:12:47,120 --> 00:12:49,830


219
00:12:49,830 --> 00:12:54,300
Claiming C is isomorphic
to both A and B.

220
00:12:54,300 --> 00:12:58,110
Now, that the verifier doesn't
want to listen to anything.

221
00:12:58,110 --> 00:13:02,550
He flips a coin to decide
at random between A and B.

222
00:13:02,550 --> 00:13:04,860
And let us assume
he has the going

223
00:13:04,860 --> 00:13:11,910
has come up A. Therefore he will
send A request to the prover.

224
00:13:11,910 --> 00:13:14,850
And the prover will
satisfy this request

225
00:13:14,850 --> 00:13:20,670
by sending pi, a correspondence
between the nodes of A

226
00:13:20,670 --> 00:13:23,400
and the nodes of
C with Cervantes,

227
00:13:23,400 --> 00:13:29,130
thereby proving with A and
C aren't missing graph.

228
00:13:29,130 --> 00:13:31,440
And verify it given the
correspondence, he says,

229
00:13:31,440 --> 00:13:32,100
that's true.

230
00:13:32,100 --> 00:13:33,570
A and C are the same graph.

231
00:13:33,570 --> 00:13:35,430
He says OK.

232
00:13:35,430 --> 00:13:40,170
So much fun, we'll do it
over and over again, K times.

233
00:13:40,170 --> 00:13:43,980
And even verifier said
OK K times in a row,

234
00:13:43,980 --> 00:13:47,010
then they will say
fine, I'm convinced,

235
00:13:47,010 --> 00:13:49,110
A and B are the same graph.

236
00:13:49,110 --> 00:13:52,210


237
00:13:52,210 --> 00:13:53,650
Why is this a proof?

238
00:13:53,650 --> 00:13:56,500
That's an interaction,
but way of proof?

239
00:13:56,500 --> 00:14:00,310
Well, first of all, let me tell
you that if the film is true,

240
00:14:00,310 --> 00:14:03,250
the verify and
will be convinced.

241
00:14:03,250 --> 00:14:05,320
It goes the prover
does what I said,

242
00:14:05,320 --> 00:14:07,120
but verifier does
what I said and they

243
00:14:07,120 --> 00:14:09,670
will be convinced that at the
time, all of it at the end

244
00:14:09,670 --> 00:14:12,500
all the time.

245
00:14:12,500 --> 00:14:14,140
Let's now argue
over whether this

246
00:14:14,140 --> 00:14:16,510
is a proof by
contradiction as we

247
00:14:16,510 --> 00:14:18,910
prove almost anything anyway.

248
00:14:18,910 --> 00:14:22,240
So let us assume that
the film is false.

249
00:14:22,240 --> 00:14:24,510
Can we convince somebody
over a false film?

250
00:14:24,510 --> 00:14:28,730


251
00:14:28,730 --> 00:14:32,240
So let's assume now with A
and B are different graphs

252
00:14:32,240 --> 00:14:35,270
but we assume you
can figure it out.

253
00:14:35,270 --> 00:14:38,720
Then they claim that no
matter how the prover choose

254
00:14:38,720 --> 00:14:44,510
a C, if a C is not the same
as A or C is not the same as C

255
00:14:44,510 --> 00:14:47,270
or is neither.

256
00:14:47,270 --> 00:14:50,810
So when C is shipped
over no matter

257
00:14:50,810 --> 00:14:54,980
how it's chosen, when re-verify
a flips a coin between A and B,

258
00:14:54,980 --> 00:14:58,490
he has at least
a chance one half

259
00:14:58,490 --> 00:15:01,490
of catching the
prover in the wrong.

260
00:15:01,490 --> 00:15:06,380
Because even prover construct
C by scrambling A say,

261
00:15:06,380 --> 00:15:09,560
and even verifier asks
for A, he can ship over

262
00:15:09,560 --> 00:15:11,870
the correspondence
between A and C.

263
00:15:11,870 --> 00:15:15,860
But if a prover constructed
C by scrambling B,

264
00:15:15,860 --> 00:15:20,910
and the verifier asks for A,
he has nothing to send here.

265
00:15:20,910 --> 00:15:22,790
So then the verifier
would say no,

266
00:15:22,790 --> 00:15:27,050
this is not the correspondence
because there's an A.

267
00:15:27,050 --> 00:15:31,930
So we chance a half, whatever
graph you send here the prover

268
00:15:31,930 --> 00:15:35,440
cannot satisfy the
verifier's request.

269
00:15:35,440 --> 00:15:40,150
And if they do this 2 times,
the prover has only a chance

270
00:15:40,150 --> 00:15:43,310
1 and 1/4 of
satisfying to verifier.

271
00:15:43,310 --> 00:15:46,360
And if you do it the K times,
its chances are 1 and 2

272
00:15:46,360 --> 00:15:49,190
to the K. And that's
a very small number.

273
00:15:49,190 --> 00:15:52,600
If K is 300 2 to the 300
is larger than the number

274
00:15:52,600 --> 00:15:56,930
of electrons in our universe.

275
00:15:56,930 --> 00:15:57,430
OK.

276
00:15:57,430 --> 00:16:00,320


277
00:16:00,320 --> 00:16:04,670
And now they say, why do we
go to the trouble of such

278
00:16:04,670 --> 00:16:06,380
a complicated truth?

279
00:16:06,380 --> 00:16:09,050
But not too complicated,
but nonetheless is not

280
00:16:09,050 --> 00:16:11,240
a stretch for understanding
the correspondence.

281
00:16:11,240 --> 00:16:14,420
Because the proof I just talk
about is a zero knowledge

282
00:16:14,420 --> 00:16:15,770
proof.

283
00:16:15,770 --> 00:16:18,780
And why is it zero knowledge?

284
00:16:18,780 --> 00:16:22,250
Well, because at the
end of the proof,

285
00:16:22,250 --> 00:16:26,280
the verifier knows that A
and B are the same graph.

286
00:16:26,280 --> 00:16:30,110
That's what the proof
want to give it away.

287
00:16:30,110 --> 00:16:31,970
But what else does
he know at the end?

288
00:16:31,970 --> 00:16:35,090
He also knows which
words he and the prover

289
00:16:35,090 --> 00:16:36,500
exchange with one another.

290
00:16:36,500 --> 00:16:37,820
He was there.

291
00:16:37,820 --> 00:16:42,270
Listen, now I claim
but from these words,

292
00:16:42,270 --> 00:16:45,320
you can retrieve that A and
B are the same graph but you

293
00:16:45,320 --> 00:16:48,500
cannot retrieve anything else.

294
00:16:48,500 --> 00:16:50,990
And how am I going to say that?

295
00:16:50,990 --> 00:16:53,480
Actually, how am I
going to prove that?

296
00:16:53,480 --> 00:16:57,280


297
00:16:57,280 --> 00:16:59,690
Well, here it is how.

298
00:16:59,690 --> 00:17:02,420
Let's assume now
that the verifier

299
00:17:02,420 --> 00:17:07,609
knows priori with A and
B are isomorphic graphs.

300
00:17:07,609 --> 00:17:08,510
This is single beat.

301
00:17:08,510 --> 00:17:11,510
He only knows that A
and B are missing graph.

302
00:17:11,510 --> 00:17:15,079
Now, he claimed that the
proof is relevant for him

303
00:17:15,079 --> 00:17:18,290
because he can come up
with the proof himself.

304
00:17:18,290 --> 00:17:19,069
Here is how.

305
00:17:19,069 --> 00:17:22,740


306
00:17:22,740 --> 00:17:27,780
So the verifier now
first flips a coin

307
00:17:27,780 --> 00:17:33,870
to decide between A or B. And
let's assume that the coin has

308
00:17:33,870 --> 00:17:38,550
come up A. Therefore, given
the coin has come up A,

309
00:17:38,550 --> 00:17:43,300
he scrambles by
some correspondence

310
00:17:43,300 --> 00:17:48,640
pi, the vertices of A
and creates a graph C.

311
00:17:48,640 --> 00:17:52,070
So he comes up with a
renaming of the vertices of A,

312
00:17:52,070 --> 00:17:55,390
renames the vertices and
comes up with a graph C.

313
00:17:55,390 --> 00:17:59,260
Then is going to lay down
with the following words.

314
00:17:59,260 --> 00:18:01,570
C with an arrow
from left to right.

315
00:18:01,570 --> 00:18:05,260
A with an arrow
from right to left.

316
00:18:05,260 --> 00:18:09,020
And pi with an arrow
from left to right.

317
00:18:09,020 --> 00:18:10,880
End of the quotes.

318
00:18:10,880 --> 00:18:14,140
Now, I claim that
this conversation

319
00:18:14,140 --> 00:18:16,250
is equally likely is
essentially the same

320
00:18:16,250 --> 00:18:18,230
with this conversation.

321
00:18:18,230 --> 00:18:19,040
Why?

322
00:18:19,040 --> 00:18:22,930
Well, the first entry,
the first message here,

323
00:18:22,930 --> 00:18:28,990
is a randomly scrambled version
of A and B. And so is here.

324
00:18:28,990 --> 00:18:32,700
The second message here is
the result of a coin toss.

325
00:18:32,700 --> 00:18:34,240
And so is here.

326
00:18:34,240 --> 00:18:37,470
Here, we noticed that the coin
we flipped after seeing C,

327
00:18:37,470 --> 00:18:39,240
and here we flipped up before.

328
00:18:39,240 --> 00:18:41,880
But none of it is
a random coin toss.

329
00:18:41,880 --> 00:18:43,290
And finally, what is here?

330
00:18:43,290 --> 00:18:47,032
Here is a correspondence between
vertices of A and vertices of C

331
00:18:47,032 --> 00:18:48,360
that preserve the edges.

332
00:18:48,360 --> 00:18:50,680
And so is here.

333
00:18:50,680 --> 00:18:55,560
So you see why this mechanism
proves to you that A and B are

334
00:18:55,560 --> 00:18:58,350
isomorphic graphs, if
you knew this beforehand,

335
00:18:58,350 --> 00:19:01,440
this mechanism doesn't
add anything else.

336
00:19:01,440 --> 00:19:05,070
Because you could reconstruct
the conversations yourself

337
00:19:05,070 --> 00:19:08,770
and whatever you can do by
yourself is your knowledge.

338
00:19:08,770 --> 00:19:11,610
So this proof is a zero
knowledge on top of the fact

339
00:19:11,610 --> 00:19:14,950
that A and B are the same graph.

340
00:19:14,950 --> 00:19:19,230
So I've showed you basically a
way for the graph isomorphism

341
00:19:19,230 --> 00:19:25,310
problem that convinces
you that the proof exists

342
00:19:25,310 --> 00:19:28,550
even though you don't even
know how the proof starts.

343
00:19:28,550 --> 00:19:31,970
And this is just a small
piece of this theory

344
00:19:31,970 --> 00:19:35,870
of zero knowledge
which goes on to answer

345
00:19:35,870 --> 00:19:41,810
many other questions not
fundamental to our mathematical

346
00:19:41,810 --> 00:19:42,620
thinking.

347
00:19:42,620 --> 00:19:46,520
Namely, what does it
mean to prove something.

348
00:19:46,520 --> 00:19:50,430
What does it mean that
something is false?

349
00:19:50,430 --> 00:19:53,040
What does it mean
to convey knowledge?

350
00:19:53,040 --> 00:19:56,170
What does it mean to be
totally unintelligible?

351
00:19:56,170 --> 00:19:59,700


352
00:19:59,700 --> 00:20:04,020
Now, a lot of
leverage that you can

353
00:20:04,020 --> 00:20:08,130
do if you answer
correctly such a question.

354
00:20:08,130 --> 00:20:12,390
And in fact, look, you don't go
to bovver the classical notion

355
00:20:12,390 --> 00:20:16,110
of a proof which has been there
for 500 years from nothing.

356
00:20:16,110 --> 00:20:18,750
Because once you are large
such powerful notions

357
00:20:18,750 --> 00:20:19,950
but very powerful.

358
00:20:19,950 --> 00:20:21,850
You just walk
outside even crazier

359
00:20:21,850 --> 00:20:25,290
because it was built using
mathematical formulas

360
00:20:25,290 --> 00:20:28,377
and prove that it stood before
people really put together.

361
00:20:28,377 --> 00:20:30,210
So once if you tackle
this notion of proofs,

362
00:20:30,210 --> 00:20:31,930
you gain power.

363
00:20:31,930 --> 00:20:36,690
And in fact, this theory has
applications in practice.

364
00:20:36,690 --> 00:20:40,350
And these applications
are nothing less

365
00:20:40,350 --> 00:20:44,160
of privacy of communication.

366
00:20:44,160 --> 00:20:50,400
Ability of checking programs,
checking program or designing

367
00:20:50,400 --> 00:20:53,460
programs so that they are
much more reliable than we

368
00:20:53,460 --> 00:20:54,700
could do before.

369
00:20:54,700 --> 00:20:55,930
And I don't have the time.

370
00:20:55,930 --> 00:20:59,490
So if it's a question that lies
at the heart of our computer

371
00:20:59,490 --> 00:21:03,420
age, but I don't have a
question to illustrate this.

372
00:21:03,420 --> 00:21:07,290
But in the last two
minutes left to me,

373
00:21:07,290 --> 00:21:12,600
I'm going perhaps to take
just a small piece and one

374
00:21:12,600 --> 00:21:13,390
small piece.

375
00:21:13,390 --> 00:21:22,830
I chose is the
problem of passwords.

376
00:21:22,830 --> 00:21:26,150
Passwords is an ancient problem.

377
00:21:26,150 --> 00:21:29,070
Or rather a problem
is an ancient tool.

378
00:21:29,070 --> 00:21:32,580
And I was rereading to
see it is a while ago

379
00:21:32,580 --> 00:21:36,330
and with the continuous
source of inspiration for me.

380
00:21:36,330 --> 00:21:39,810


381
00:21:39,810 --> 00:21:45,600
And he was describing a famous
battle between the Athenians

382
00:21:45,600 --> 00:21:49,817
and the Syracusans,
roughly around 400 BC.

383
00:21:49,817 --> 00:21:51,400
And what is strange
about this battle,

384
00:21:51,400 --> 00:21:52,770
the battle was a nighttime.

385
00:21:52,770 --> 00:21:55,350
So they couldn't-- the two
armies couldn't see each other

386
00:21:55,350 --> 00:21:57,030
and they couldn't
recognize each other.

387
00:21:57,030 --> 00:22:00,090
Therefore, the Athenians
had the password.

388
00:22:00,090 --> 00:22:02,860
But they were shouting with
passwords back and forth

389
00:22:02,860 --> 00:22:07,890
so much that the Syracusans
got hold of the password

390
00:22:07,890 --> 00:22:10,110
and they started
shouting also, and they

391
00:22:10,110 --> 00:22:13,110
passed through their lines and
then attacked them on the back

392
00:22:13,110 --> 00:22:15,430
and defeated them.

393
00:22:15,430 --> 00:22:21,210
Now, this problem
does not arise just

394
00:22:21,210 --> 00:22:23,410
because we are so back in time.

395
00:22:23,410 --> 00:22:25,950
This is a problem
and today's also.

396
00:22:25,950 --> 00:22:27,900
If you use passwords
for instance

397
00:22:27,900 --> 00:22:30,870
to login in the morning
from your home terminal

398
00:22:30,870 --> 00:22:34,380
to the mainframe,
how does this work?

399
00:22:34,380 --> 00:22:35,760
It's the same thing.

400
00:22:35,760 --> 00:22:39,210
Well, the mainframe has a list
of users and their passwords

401
00:22:39,210 --> 00:22:42,420
in my case, for president.

402
00:22:42,420 --> 00:22:45,120
Now, which my friends
do guests but they

403
00:22:45,120 --> 00:22:47,370
don't guess the number
of exclamative points.

404
00:22:47,370 --> 00:22:51,750


405
00:22:51,750 --> 00:22:57,240
So if the right
password is introduced,

406
00:22:57,240 --> 00:22:59,360
then the mainframe
says, yes this

407
00:22:59,360 --> 00:23:04,070
must be Silvio lets this
guy have access to my files.

408
00:23:04,070 --> 00:23:05,930
But what is the problem
with this scheme?

409
00:23:05,930 --> 00:23:09,110
Well, the problem is
that listening on that

410
00:23:09,110 --> 00:23:10,820
may be a bad guy.

411
00:23:10,820 --> 00:23:13,130
Therefore he captures
my passwords very much

412
00:23:13,130 --> 00:23:14,400
like this occlusions.

413
00:23:14,400 --> 00:23:18,650
And he's going next
today to impersonate me.

414
00:23:18,650 --> 00:23:22,490
And that is intrinsic in
the nature of a password.

415
00:23:22,490 --> 00:23:28,010
If somebody figures out
your password, you're dead.

416
00:23:28,010 --> 00:23:34,100
From 400 we see up to
now but not anymore.

417
00:23:34,100 --> 00:23:35,750
Let me tell you why.

418
00:23:35,750 --> 00:23:38,540
And here is our zero
knowledge comes to the rescue.

419
00:23:38,540 --> 00:23:41,830


420
00:23:41,830 --> 00:23:44,230
You see in the
mainframe, just to keep

421
00:23:44,230 --> 00:23:45,910
it simple for the
sake of example

422
00:23:45,910 --> 00:23:49,720
I'm going to store close to
my name a pair of graphs,

423
00:23:49,720 --> 00:23:52,270
say files on the vertices.

424
00:23:52,270 --> 00:23:55,670
And they're copies,
different copies,

425
00:23:55,670 --> 00:23:58,040
scrambled copies
of the same graph.

426
00:23:58,040 --> 00:23:59,710
So A and B are isomorphic.

427
00:23:59,710 --> 00:24:03,010
But I do not store in the
mainframe the isomorphism so

428
00:24:03,010 --> 00:24:04,690
as the parentheses indicate.

429
00:24:04,690 --> 00:24:06,310
I do not store the
correspondence.

430
00:24:06,310 --> 00:24:08,740
I will keep the correspondence.

431
00:24:08,740 --> 00:24:14,080
And now I tell the
mainframe who proves

432
00:24:14,080 --> 00:24:17,530
to you that A is the same
graph and B, that's me.

433
00:24:17,530 --> 00:24:19,630
Let him have access to my files.

434
00:24:19,630 --> 00:24:22,810


435
00:24:22,810 --> 00:24:28,010
And now, in the morning, which
will take a while but not.

436
00:24:28,010 --> 00:24:31,910


437
00:24:31,910 --> 00:24:35,090
I'm going to wake up, scramble--

438
00:24:35,090 --> 00:24:38,300
give a random scramble
version C of A and B

439
00:24:38,300 --> 00:24:40,160
and ship it over
to the mainframe.

440
00:24:40,160 --> 00:24:44,570
The mainframe flips a coin
to decide between A and B,

441
00:24:44,570 --> 00:24:46,520
say he decides A.
And therefore he

442
00:24:46,520 --> 00:24:49,550
request the correspondence
between A and C,

443
00:24:49,550 --> 00:24:51,170
and they give him the response.

444
00:24:51,170 --> 00:24:55,140
And the mainframe says,
OK, so far so good.

445
00:24:55,140 --> 00:24:57,770
This is a correspondence
between A and C.

446
00:24:57,770 --> 00:25:02,690
And now, we are going
to do it 20 times.

447
00:25:02,690 --> 00:25:05,240
It will take a while.

448
00:25:05,240 --> 00:25:07,590
But what do I gain by this?

449
00:25:07,590 --> 00:25:12,110
You see what I gain
is, if somebody now

450
00:25:12,110 --> 00:25:15,920
is listening to
these conversations

451
00:25:15,920 --> 00:25:18,240
he is rather confused.

452
00:25:18,240 --> 00:25:18,740
Why?

453
00:25:18,740 --> 00:25:21,980
Because this proof
I already argued

454
00:25:21,980 --> 00:25:24,330
before is a zero
knowledge proof.

455
00:25:24,330 --> 00:25:27,890
So only it betrays is the fact
that A and B are the same,

456
00:25:27,890 --> 00:25:30,920
but not to what the
correspondence is.

457
00:25:30,920 --> 00:25:32,540
So even though
somebody is listening

458
00:25:32,540 --> 00:25:34,940
to this identification
process, he

459
00:25:34,940 --> 00:25:37,520
has no way to impersonate me.

460
00:25:37,520 --> 00:25:40,980
And just to give you
just a simple example,

461
00:25:40,980 --> 00:25:45,860
assume this guy records
what graph C I send over.

462
00:25:45,860 --> 00:25:48,600
And then in the
previous conversation

463
00:25:48,600 --> 00:25:52,970
the mainframe has asked me for
A. So he sends over the same C,

464
00:25:52,970 --> 00:25:55,430
assume that the computer chooses
at random between A and B

465
00:25:55,430 --> 00:25:57,200
and he's lucky.

466
00:25:57,200 --> 00:25:59,420
He chooses the
same A. Therefore,

467
00:25:59,420 --> 00:26:02,120
he can send the
same pi that I sent.

468
00:26:02,120 --> 00:26:05,780
But for the second graph,
the computer originally

469
00:26:05,780 --> 00:26:10,640
say you ask me for A. But
now, he's asking him for B,

470
00:26:10,640 --> 00:26:13,130
then he doesn't know
what cosponsorship say.

471
00:26:13,130 --> 00:26:14,330
You see what it is?

472
00:26:14,330 --> 00:26:17,960
Essentially this is a
trick to shrink down

473
00:26:17,960 --> 00:26:23,140
an exponential number of
questions to very very few.

474
00:26:23,140 --> 00:26:25,610
Good.

475
00:26:25,610 --> 00:26:30,920
Now, this is not practical by
any stretch of imagination.

476
00:26:30,920 --> 00:26:37,030
However, this does not
bother us as theoreticians.

477
00:26:37,030 --> 00:26:40,120
But let me tell you why he does
not bother us as theoreticians.

478
00:26:40,120 --> 00:26:43,510
Because now, once you know
what you are shooting for,

479
00:26:43,510 --> 00:26:46,960
and this is the last
slide of my talk,

480
00:26:46,960 --> 00:26:49,490
then you can make
it efficient later.

481
00:26:49,490 --> 00:26:52,660
And in fact, not using
graph isomorphism

482
00:26:52,660 --> 00:26:57,100
but using number theory, the
same identification process

483
00:26:57,100 --> 00:27:00,550
much more efficient
has been devised

484
00:27:00,550 --> 00:27:02,860
by Gold Wasser and
myself and Rackoff

485
00:27:02,860 --> 00:27:05,950
and then improved by
Fiat, Feige and Shamir.

486
00:27:05,950 --> 00:27:10,030
And finally, Shamir and I have
a zero knowledge identification

487
00:27:10,030 --> 00:27:13,000
scheme by which in the
morning to identify yourself,

488
00:27:13,000 --> 00:27:16,960
all you need to have is
two multiplications, which

489
00:27:16,960 --> 00:27:19,175
I consider it practical.

490
00:27:19,175 --> 00:27:22,150
Now, for the future.

491
00:27:22,150 --> 00:27:28,660
Well, we have only explored
the tip of the iceberg

492
00:27:28,660 --> 00:27:30,560
and there is a lot
of other things

493
00:27:30,560 --> 00:27:34,190
that needs to be discovered
in really farther down,

494
00:27:34,190 --> 00:27:35,970
and we want to
bring it to light.

495
00:27:35,970 --> 00:27:39,340
Let me tell you a waiver of
it, thanks to zero knowledge,

496
00:27:39,340 --> 00:27:42,820
cryptography is
undergoing a transition

497
00:27:42,820 --> 00:27:45,430
from an art to a science.

498
00:27:45,430 --> 00:27:45,970
Thank you.

499
00:27:45,970 --> 00:27:48,070
[APPLAUSE]

500
00:27:48,070 --> 00:27:50,820
[MUSIC PLAYING]

501
00:27:50,820 --> 00:28:15,000