Disclaimer: contains maths
If you don’t understand something
- Not your fault, this stuff is hard
- Nobody understands it fully
If you don’t understand anything
- My fault, anything can be explained at some level
If you do understand everything
- Collect your Turing Award & Fields Medal
It is difficult to find zero knowledge resources that avoid the extremes of either over simplyfying the subject, or presenting so much mathematical detail that the reader gets bogged down and loses interest.
The purpose of this tutorial then is to find an accessible but informative middle ground.
We start with some examples to show how zero knowledge proofs can proceed, and the situations where they could be used.
Colour blind verifier
This is an interactive proof showing that the prover can distinguish between a red and a green billiard ball, whereas the verifier cannot distinguish them.
• The prover wants to show the verifier that they have different colours but does not want him to learn which is red and which is green.
• Step 1: The verifier takes the balls, each one in each hand, holds them in front of the prover and then hides them behind his back. Then, with probability 1/2 either swaps them (at most once) or keeps them as they are. Finally, he brings them out in front.
• Step 2: The prover has to say the verifier switched them or not.
• Step 3: Since they have different colours, the prover can always say whether they were switched or not.
But, if they were identical (the verifier is inclined to believe that), the prover would be wrong with probability 1/2.
• Finally, to convince the verifier with very high probability, the prover could repeat Step 1 to Step 3 k times to reduce the probability of the prover being successful by chance to a extremely small amount.
Wheres Wally
Based on the pictures of crowds where Wally is distinctivly dressed, the aim being to find him amoungst a sea of similar people.
The proof procedes as follows :
Imagine the Peggy has found Wally in the pricture and wants to prove this fact to Victor, however if she just shows him, Victor is liable to cheat and claim he also found Wally.
In order to prove to Victor that she has indeed found Wally, without givin away his loaction in the picture
Sudoku
An interactive proof can be created to prove the knowledge of a solution to a sudoku puzzle by placing cards in the sudoku grid. The process is descrived here Sudoku Proof
Now that we an intuitive grasp of zero knowledge proofs, lets go into the details by looking at a certain type of proof system, a zk-SNARK.
The process of creating and using a zk-SNARK can be summarised as
A zk-SNARK consists of three algorithms $G$, $$, $V$ defined as follows:
The Creator takes a secret parameter lambda and a program $C$, and generates two publicly available keys:
These keys are public parameters that only need to be generated once for a given program $C$.
The prover Peggy takes a proving key $pk$, a public input $x$ and a private witness $w$.
Peggy generates a proof $pr$ $=P(pk,x,w)$ that claims that Peggy knows a witness $w$ and that the witness satisfies the program $C$.
The verifier Victor computes $V(vk,x,pr)$ which returns true if the proof is correct, and false otherwise.
Thus this function returns true if Peggy knows a witness $w$ satisfying
$C(x,w)=true$
Note here the secret parameter lambda This parameter sometimes makes it tricky to use zk-SNARK in real-world applications. The reason for this is that anyone who knows this parameter can generate fake proofs. Specifically, given any program $C$ and public input $x$ a person who knows lambda can generate a proof $pr2$ such that $V(vk,x,pr2)$ evaluates to true without knowledge of the secret $w$.
We can see this workflow in Zokrates, a project which allows you to create zx-SNARKS on the Ethereum blockchain.
The Creator writes and compiles a program in the Zokrates DSL
The Creator / Prover generates a trusted setup for the compiled program
The Prover computes a witness for the compiled program
The Prover generates a proof - Using the proving key, she generates a proof for a computation of the compiled program
The Creator / Prover exports a Verifier - Using the verifying key she generates a Solidity contract which contains the generated verification key and a public function to verify a solution to the compiled program
That gives us an overview of the process, lets now turn to the theoretical underpinnings, starting with a look at what it means to have a proof.
A statement is a proposition we want to prove. It depends on:
Given the instance variables, we can find a short proof that we know witness variables that make the statement true (possibly without revealing any other information).
What do we require of a proof ?
Completeness: there exists an honest prover P that can convince the honest verifier V of any correct statement with very high probability.
Soundness: even a dishonest prover P running in super-polynomial time cannot convince an honest verifier V of an incorrect statement. Note: P does not necessarily have to run in polynomial time, but V does.
To make our proof zero knowledge we also need ‘zero knowledginess’
To oversimplify: represented on a computer, a ZKP is nothing more than a sequence of numbers, carefully computed by Peggy, together with a bunch of boolean checks that Victor can run in order to verify the proof of correctness for the computation.
A zero-knowledge protocol is thus the mechanism used for deriving these numbers and defining the verification checks.
Non-interactivity is only useful if we want to allow multiple independent verifiers to verify a given proof without each one having to individually query the prover.
In contrast, in non-interactive zero knowledge protocols there is no repeated communication between the prover and the verifier. Instead, there is only a single “round”, which can be carried out asynchronously.
Using publicly available data, Peggy generates a proof, which she publishes in a place accessible to Victor (e.g. on a distributed ledger).
Following this, Victor can verify the proof at any point in time to complete the “round”. Note that even though Peggy produces only a single proof, as opposed to multiple ones in the interactive version, the verifier can still be certain that except for negligible probability, she does indeed know the secret she is claiming.
Succinctness is necessary only if the medium used for storing the proofs is very expensive and/or if we need very short verification times.
A proof of knowledge is stronger and more useful than just proving the statement is true. For instance, it allows me to prove that I know a secret key, rather than just that it exists.
In a proof, the soundness holds against a computationally unbounded prover and in an argument, the soundness only holds against a polynomially bounded prover.
Arguments are thus often called “computationally sound proofs”.
The Prover and the Verifier have to agree on what they’re proving. This means that both know the statement that is to be proven and what the inputs to this statement represent.
For a cyclic group ${G}_{q}$ of order $q$ with generator $g$.
In order to prove knowledge of $x={\mathrm{log}}_{g}y$, the prover interacts with the verifier as follows:
In the first round the prover commits himself to randomness $r$ therefore the first message $t={g}^{r}$ is also called commitment.
The verifier replies with a challenge $c$ chosen at random.
After receiving $c$, the prover sends the third and last message (the response) $s=r+cx$
The verifier accepts, if ${g}^{s}=t{y}^{c}$
Protocols which have the above three-move structure (commitment, challenge and response) are called sigma protocols
Lets look first at transforming the problem into a QAP, there are
3 steps :
We are aiming to create arithmetic and / or boolean circuits from our code, so we change the high level language into a sequence of statements that are of two forms
x = y (where y can be a variable or a number)
and
x = y (op) z (where op can be +, -, *, / and y and z can be variables, numbers or themselves sub-expressions).
For example we go from
def qeval(x):
y = x**3
return x + y + 5
to
sym_1 = x * x
y = sym_1 * x
sym_2 = y + x
~out = sym_2 + 5
(From http://coders-errand.com/constraint-systems-for-zk-snarks/)
The important thing to understand is that a R1CS is not a computer program, you are not asking it to produce a value from certain inputs. Instead, a R1CS is a verifier, it shows that an already complete computation is correct .
The arithmetc circuit is a compositon of multplicatve sub-circuits (a single multiplcation gate and mutiple addition gates)
A rank 1 constraint system is a set of these sub-circuits expressed as constraints, each of the form:
$AXB=C$
where $A,B,C$ are each linear combinations c1· v1+ c2· v2+ …
The ${c}_{i}$ are constant field elements, and the ${v}_{i}$ are instance or witness variables (or 1).
The inputs and outputs of a subcircuit are not predetermined.
More generally, an implementation of $x=f(a,b)$ doesn’t mean that x is
computed from a and b, just that x, a, and b are consistent.
Constraint languages can be viewed as a generalization of
functional languages:
Thus our R1CS contains :
The R1CS has
Assume P wants to prove to V he knows
c1, c2, c3 such that (c1 · c2) · (c1 + c3) = 7
We transform the expression above into an arithmetic circuit as depicted below
Definition: A legal assignment for the circuit is of the form:
(c1, . . . , c5), where c4 = c1 · c2 and c5 = c4 · (c1 + c3).
Gennaro et al.showed that circuit satisfiability can be efficiently reduced to QAP satisfiability so we will use a Quadratic Arithmetic Program to verify circuit computation
Our goal is to devise a set of polynomials that simultaneously encode all of the constraints, so that we can verify the satisfiability thereof with a single check on the polynomials instead of a check over each constraint. The clever trick is to build the polynomials in a way that they can generate all of the constraints.
The R1CS is initially encoded as vectors and from these we define poynomials
We are aiming to create a left, right, and output polynomial of degree $m$ for $m$ constraints
and a target polynomial of degree $d$
Given fixed values
$({c}_{1},\dots ,{c}_{5})$
we use them as coefficients to define a left, right, and output polynomials.
We want to show that our created QAP has a legal assignment,
From our example above each multiplication gate X has an associated target point in the field ${\mathbb{F}}_{p}$ . In our example, they will be 1 and 2.
• Each wire (in our example, there are 5) has an associated left, right and
output polynomials: .
• Each polynomial evaluates to either 0 or 1 on the target points. If the polynomial is “related” to the multiplication gate associated with a target point, then it evaluates to 1. Otherwise to 0.
we define our left. right and output polynomials as
Our QAP Q of degree d and size m consists of polynomials
${L}_{1},\dots ,{L}_{m},{R}_{1},\dots ,{R}_{m},{O}_{1},\dots ,{O}_{m}$ and a target polynomial $T$ of degree d.
An assignment (c1,…,cm) satisfies Q if,
defining
and
We moved from a situation where we had d groups of 3 * d vectors of m + 1 coefficients to one where we have 3 vectors each with m+1 polynomials of d-1 degree.
We converted a set of vectors into polynomials that generate them when evaluated at certain fixed points. We used these fixed points to generate a vanishing polynomial that divides any polynomial that evaluates to 0 at least on all those points. We created a new polynomial that summarizes all constraints and a particular assignment, and the consequence is that we can verify all constraints at once if we can divide that polynomial by the vanishing one without remainder. This division is complicated, but there are methods (the Fast Fourier Transform) that can perform if efficiently.
From http://coders-errand.com/how-to-build-a-quadratic-arithmetic-program/
Suppose Alice has a polynomial P of degree d , and Bob has a point $s\in {\mathbb{F}}_{p}$ that he chose randomly.
Bob wishes to learn E(P(s)), i.e., the HH of the evaluation of P at s
Two simple ways to do this are:
Alice sends P to Bob, and he computes E(P(s))by himself.
Bob sends s to Alice; she computes E(P(s)) and sends it to Bob.
However, in the blind evaluation problem we want Bob to learn E(P(s)) without learning P
which precludes the first option; and, most importantly,
we don’t want Alice to learn s, which rules out the second
Using HH, we can perform blind evaluation as follows.
Note that, as only hidings were sent, neither Alice learned $s$ nor Bob learned $P$
The rough intuition is that the verifier has a “correct” polynomial in mind, and wishes to check the prover knows it. Making the prover blindly evaluate their polynomial at a random point not known to them, ensures the prover will give the wrong answer with high probability if their polynomial is not the correct one. This, in turn, relies on the Schwartz-Zippel Lemma stating that “different polynomials are different at most points”.
but
the fact that Alice is able to compute E(P(s)) does not guarantee she will indeed send E(P(s)) to Bob, rather than some completely unrelated value.
Our process then becomes
Alice chooses polynomials L,R,O,H
Bob chooses a random point s ∈ ${\mathbb{F}}_{p}$, and computes E(T(s))
Alice sends Bob the hidings of all these polynomials evaluated at s, i.e. E(L(s)),E(R(s)),E(O(s)),E(H(s))
Bob checks if the desired equation holds at s. That is, he checks whether E(L(s)⋅R(s)−O(s))=E(T(s)⋅H(s)).
Furthemore we use
If Alice does not have a satisfying assignment, she will end up using polynomials where the equation does not hold identically, and thus does not hold at most choices of $s$ Therefore, Bob will reject with high probability over his choice of $s$.
We now need to make our proof non interactive, for this we produce the Common Reference String from Bob’s first message
For details of these see : https://z.cash/blog/snark-explain6/
From the useful
https://github.com/gluk64/awesome-zero-knowledge-proofs
Zero-Knowledge (Scalable|Succint) Transparent Arguments of Knowledge
zk-STARKs were created by Eli-Ben Sasson, a professor at the Technion-Israel Institute of Technology.
They generally, considered a more efficient variant of the technology - potentially faster, but the proof size is much greater.
In zk-STARKs, there is no external trusted setup phase and the randomness used is public information.
More resources available at starkware.co
https://crypto.stanford.edu/bulletproofs/
https://eprint.iacr.org/2017/1066.pdf
Aim to prove that asecret committed value lies in a given interval.
Bulletproofs require no trusted setup. However, verifying a bulletproof is more time consuming than verifying a SNARK proof.
They rely only on the discrete logarithm assumption, and are made non-interactive using the Fiat-Shamir heuristic
a general drop-in replacement for Sigma-protocols.
Bulletproofs shrink the size of the cryptographic proof from over 10kB to less than 1kB. Moreover, bulletproofs support proof aggregation, so that proving that m transaction values are valid adds only O(log(m)) additional elements to the size of a single proof.
They concentrate on confidential transactions, but can be for general NP langs
The proof size islogarithmicin the numberof multiplication gates in the arithmetic circuit for verifying a witness. The proofs are much shorter than zk-SNARK and allow inputs to be Pedersen commitments to elements of the witness.
The high level idea of this proving system is that
DIZK provides Java-based implementations using Apache Spark [Apa17] for:
There is a problem that the various libraries aren’t standardised.
jsnark - Java bindings to libsnark
http://coders-errand.com/zero-knowledge-proofs-a-laymans-introduction/
https://blog.decentriq.ch/zk-snarks-primer-part-one/
http://coders-errand.com/zk-snarks-and-their-algebraic-structure/