Zero-Knowledge Proofs - Theory

Quote from Remco Bloemen remco@0x.org

Disclaimer: contains maths

If you don’t understand something

If you don’t understand anything

If you do understand everything

Examples to give an Intuitive grasp of zero-knowledge proofs

Colour blind verifier

An interactive proof showing that the prover can distinguish between a red and a green billiard ball, whereas the verifier cannot distinguish them.

• The prover wants to show the verifier that they have different colours but does not want him to learn which is red and which is green.

• Step 1: The verifier takes the balls, each one in each hand, holds them in front of the prover and then hides them behind his back. Then, with probability 1/2 either swaps them (at most once) or keeps them as they are. Finally, he brings them out in front.

• Step 2: The prover has to say the verifier switched them or not.

• Step 3: Since they have different colours, the prover can always say whether they were switched or not.
But, if they were identical (the verifier is inclined to believe that), the prover would be wrong with probability 1/2.

• Finally, to convince the verifier with very high probability, the prover could repeat Step 1 to Step 3 k times to reduce the probability of the prover being successful by chance to a extremely small amount.

Wheres Wally

A proof based on a large sheet of paper through which Wally can be seen in a picture.

Sudoku

(https://blog.computationalcomplexity.org/2006/08/zero-knowledge-sudoku.html)

Actors in a Zero Knowledge Proof

zk-SNARKS

A zk-SNARK consists of three algorithms G, P, V defined as follows:

The Creator takes a secret parameter lambda and a program C, and generates two publicly available keys, a proving key pk, and a verification key vk. These keys are public parameters that only need to be generated once for a given program C.

The prover Peggy takes a proving key pk, a public input x and a private witness w. Peggy generates a proof prf=P(pk,x,w) that claims that Peggy knows a witness w and that the witness satisfies the program C.

The verifier Victor computes V(vk,x,prf) which returns true if the proof is correct, and false otherwise. Thus this function returns true if Peggy knows a witness w satisfying

C(x,w)=true

Trusted Setups and Toxic Waste

Note here the secret parameter lambda This parameter sometimes makes it tricky to use zk-SNARK in real-world applications. The reason for this is that anyone who knows this parameter can generate fake proofs. Specifically, given any program C and public input x a person who knows lambda can generate a proof fake_prf such that V(vk, x, fake_prf) evaluates to true without knowledge of the secret w.

A preview of the process flow in Zokrates

From Proof Systems to ZKP Systems

Proving Systems

A statement is a proposition we want to prove. It depends on:

Given the instance variables, we can find a short proof that we know witness variables that make the statement true (possibly without revealing any other information).

What do we require of a proof ?

For a Zero Knowledge Proof we also need ‘zero knowledginess’

To oversimplify: represented on a computer, a ZKP is nothing more than a sequence of numbers, carefully computed by Peggy, together with a bunch of boolean checks that Victor can run in order to verify the proof of correctness for the computation.

A zero-knowledge protocol is thus the mechanism used for deriving these numbers and defining the verification checks.

Interactive v Non Interactive Proofs

Non-interactivity is only useful if we want to allow multiple independent verifiers to verify a given proof without each one having to individually query the prover.

In contrast, in non-interactive zero knowledge protocols there is no repeated communication between the prover and the verifier. Instead, there is only a single “round”, which can be carried out asynchronously.
Using publicly available data, Peggy generates a proof, which she publishes in a place accessible to Victor (e.g. on a distributed ledger).
Following this, Victor can verify the proof at any point in time to complete the “round”. Note that even though Peggy produces only a single proof, as opposed to multiple ones in the interactive version, the verifier can still be certain that except for negligible probability, she does indeed know the secret she is claiming.

Succint v Non Succint

Succinctness is necessary only if the medium used for storing the proofs is very expensive and/or if we need very short verification times.

Proof v Proof of Knowledge

A proof of knowledge is stronger and more useful than just proving the statement is true. For instance, it allows me to prove that I know a secret key, rather than just that it exists.

Argument v Proof

In a proof, the soundness holds against a computationally unbounded prover and in an argument, the soundness only holds against a polynomially bounded prover.
Arguments are thus often called “computationally sound proofs”.

The Prover and the Verifier have to agree on what they’re proving. This means that both know the statement that is to be proven and what the inputs to this statement represent.

Schnoor Protocol

For a cyclic group Gq of order q with generator g.
In order to prove knowledge of x=loggy, the prover interacts with the verifier as follows:

In the first round the prover commits himself to randomness r therefore the first message t=gr is also called commitment.
The verifier replies with a challenge c chosen at random.
After receiving c, the prover sends the third and last message (the response) s=r+cx
The verifier accepts, if gs=tyc

Protocols which have the above three-move structure (commitment, challenge and response) are called sigma protocols


Transforming our Problem from High Level Language to ZKP

Lets look first at transforming the problem into a QAP, there are
3 steps :

Code Flattening

We are aiming to create arithmetic and / or boolean circuits from our code, so we change the high level language into a sequence of statements that are of two forms

x = y (where y can be a variable or a number)
and
x = y (op) z (where op can be +, -, *, / and y and z can be variables, numbers or themselves sub-expressions).

For example we go from

def qeval(x):
    y = x**3
    return x + y + 5

to

sym_1 = x * x
y = sym_1 * x
sym_2 = y + x
~out = sym_2 + 5

Rank 1 Constraint Systems

(From http://coders-errand.com/constraint-systems-for-zk-snarks/)

The important thing to understand is that a R1CS is not a computer program, you are not asking it to produce a value from certain inputs. Instead, a R1CS is a verifier, it shows that an already complete computation is correct .

Circuit

A rank 1 constraint system is a set of rank 1 constraints, each of the form:
AXB=C
where A,B,C are each linear combinations c1· v1+ c2· v2+ …
The ci are constant field elements, and the vi are instance or witness variables (or 1).

The inputs and outputs of a subcircuit are not predetermined.

More generally, an implementation of x=f(a,b) doesn’t mean that x is
computed from a and b, just that x, a, and b are consistent.

Constraint languages can be viewed as a generalization of
functional languages:

Example

Assume P wants to prove to V he knows
c1, c2, c3 such that (c1 · c2) · (c1 + c3) = 7
We transform the expression above into an arithmetic circuit as depicted below
Circuit

Definition: A legal assignment for the circuit is of the form:

(c1, . . . , c5), where c4 = c1 · c2 and c5 = c4 · (c1 + c3).

QAP

Gennaro et al.showed that circuit satisfiability can be efficiently reduced to QAP satisfiability so we will use a Quadratic Arithmetic Program to verify circuit computation

Our goal is to devise a set of polynomials that simultaneously encode all of the constraints, so that we can verify the satisfiability thereof with a single check on the polynomials instead of a check over each constraint. The clever trick is to build the polynomials in a way that they can generate all of the constraints.

The QAP itself is made up of 3 polynomial vectors:
PA,PB,PC
corresponding to the linear combinations of the same name in the R1CS
plus one polynomial Z.

Given fixed values
(c1,,c5)
we use them as coefficients to define a left, right, and output “sum” polynomials.
We want to show that our created QAP has a legal assignment,

From our example above each multiplication gate X has an associated target point in the field Fp . In our example, they will be 1 and 2.
• Each wire (in our example, there are 5) has an associated left, right and
output polynomials: .
• Each polynomial evaluates to either 0 or 1 on the target points. If the polynomial is “related” to the multiplication gate associated with a target point, then it evaluates to 1. Otherwise to 0.

The important thing to retain is that since we have d constraints we can compute a polynomial of degree at most d-1 for each entry in the linear combination. And this defines the main part of the QAP:
the PA,PB,PC polynomial vectors.

Z Polynomial

Z is defined as the polynomial that is 0 exactly at the points we chose above, r1 to rd.

Z is defined as (x - 1) * (x - 2) * (x - 3) … - the simplest polynomial that is equal to zero at all points that correspond to logic gates. It is an elementary fact of algebra that any polynomial that is equal to zero at all of these points has to be a multiple of this minimal polynomial, and if a polynomial is a multiple of Z then its evaluation at any of those points will be zero;

P(z) = <PA(z) . s> * <PB(z) . s> - <PC(z) . s>

Now, it is a property of Algebra that a polynomial is equal to the product of linear terms of its roots and a scalar. What this means is that if a polynomial has zeroes at points r1, r2, …, rd, then it can be written as

P(z) = (z - r1) (z - r2) … (z - rd) H(z),
where H(z) is some other polynomial that represents the other places at which P is 0. Now, notice the similarity with our chosen Z(z) polynomial:

Every polynomial P that is 0 at least for the points associated with all the constraints will be necessarily a multiple of Z(z), and can be written as

P(z) = Z(z) * H(z)

We converted a set of vectors into polynomials that generate them when evaluated at certain fixed points. We used these fixed points to generate a vanishing polynomial that divides any polynomial that evaluates to 0 at least on all those points. We created a new polynomial that summarizes all constraints and a particular assignment, and the consequence is that we can verify all constraints at once if we can divide that polynomial by the vanishing one without remainder. This division is complicated, but there are methods (the Fast Fourier Transform) that can perform if efficiently.

From http://coders-errand.com/how-to-build-a-quadratic-arithmetic-program/

Zero Knowledginess

Blind evaluation of a polynomial using Homomorphic Hiding

Suppose Alice has a polynomial P of degree d , and Bob has a point sFp that he chose randomly.
Bob wishes to learn E(P(s)), i.e., the HH of the evaluation of P at s
Two simple ways to do this are:

  1. Alice sends P to Bob, and he computes E(P(s))by himself.

  2. Bob sends s to Alice; she computes E(P(s)) and sends it to Bob.

However, in the blind evaluation problem we want Bob to learn E(P(s)) without learning P
which precludes the first option; and, most importantly,
we don’t want Alice to learn s, which rules out the second

Using HH, we can perform blind evaluation as follows.

  1. Bob sends to Alice the hidings E(1),E(s),…,E(sd).
  2. Alice computes E(P(s)) from the elements sent in the first step, and sends
    E(P(s)) to Bob. (Alice can do this since E supports linear combinations, and P(s)is a linear combination of 1,s,…,sd.)

Note that, as only hidings were sent, neither Alice learned s nor Bob learned P

The rough intuition is that the verifier has a “correct” polynomial in mind, and wishes to check the prover knows it. Making the prover blindly evaluate their polynomial at a random point not known to them, ensures the prover will give the wrong answer with high probability if their polynomial is not the correct one. This, in turn, relies on the Schwartz-Zippel Lemma stating that “different polynomials are different at most points”.

but

the fact that Alice is able to compute E(P(s)) does not guarantee she will indeed send E(P(s)) to Bob, rather than some completely unrelated value.

Remember our QAP Q of degree d and size m consists of polynomials L1,…,Lm, R1,…,Rm, O1,…,Om and a target polynomial T of degree d.

An assignment (c1,…,cm) satisfies Q if, defining
L:=i=1mciLi,R:=i=1mciRi,O:=i=1mciOi and P:=L⋅R−O, we have that T divides P.

Our process then becomes

  1. Alice chooses polynomials L,R,O,H of degree at most d

  2. Bob chooses a random point s ∈ Fp, and computes E(T(s))

  3. Alice sends Bob the hidings of all these polynomials evaluated at s, i.e. E(L(s)),E(R(s)),E(O(s)),E(H(s))

  4. Bob checks if the desired equation holds at s. That is, he checks whether E(L(s)⋅R(s)−O(s))=E(T(s)⋅H(s)).

Furthemore we use

For details of these see : https://z.cash/blog/snark-explain6/


Other Projects

From the useful
https://github.com/gluk64/awesome-zero-knowledge-proofs
Comparisons

Starks

Zero-Knowledge (Scalable|Succint) Transparent Arguments of Knowledge

zk-STARKs were created by Eli-Ben Sasson, a professor at the Technion-Israel Institute of Technology.
They generally, considered a more efficient variant of the technology - potentially faster, but the proof size is much greater.

Trusted Setup

In zk-STARKs, there is no external trusted setup phase and the randomness used is public information.

Vitalik’s Blog Articles

The STARK paper

libSTARK

More resources available at starkware.co

Bulletproofs

https://crypto.stanford.edu/bulletproofs/
https://eprint.iacr.org/2017/1066.pdf

Aim to prove that asecret committed value lies in a given interval.

Bulletproofs require no trusted setup. However, verifying a bulletproof is more time consuming than verifying a SNARK proof.
They rely only on the discrete logarithm assumption, and are made non-interactive using the Fiat-Shamir heuristic

a general drop-in replacement for Sigma-protocols.

Bulletproofs shrink the size of the cryptographic proof from over 10kB to less than 1kB. Moreover, bulletproofs support proof aggregation, so that proving that m transaction values are valid adds only O(log(m)) additional elements to the size of a single proof.

They concentrate on confidential transactions, but can be for general NP langs
The proof size islogarithmicin the numberof multiplication gates in the arithmetic circuit for verifying a witness. The proofs are much shorter than zk-SNARK and allow inputs to be Pedersen commitments to elements of the witness.

The high level idea of this proving system is that

  1. The prover commits to a value(s) that he wants to prove knowledge of
  2. The prover generates the proof by enforcing the constraints over the committed values and any additional public values. The constraints might require the prover to commit to some additional variables.
  3. Prover sends the verifier all the commitments he made in step 1 and step 2 along with the proof from step 2.
  4. The verifier now verifies the proof by enforcing the same constraints over the commitments plus any public values.

DIZK

Dizk Repo
Dizk Paper

DIZK provides Java-based implementations using Apache Spark [Apa17] for:

  1. Proof systems
    • A serial and distributed preprocessing zkSNARK for R1CS (Rank-1 Constraint Systems), an NP-complete language that resembles arithmetic circuit satisfiability. The zkSNARK is the protocol in [Gro16].
    • A distributed Merlin-Arthur proof system for evaluating arithmetic circuits on batches of inputs; see [Wil16].
  2. Scalable arithmetic
    • A serial and distributed radix-2 fast Fourier transform (FFT); see [Sze11].
    • A serial and distributed multi-scalar multiplication (MSM); see [BGMW93] [Pip76] [Pip80].
    • A serial and distributed Lagrange interpolation (Lag); see [BT04].
  3. Applications using the above zkSNARK for
    • Authenticity of photos on three transformations (crop, rotation, blur); see [NT16].
    • Integrity of machine learning models with support for linear regression and covariance matrices; see [Bis06] [Can69] [LRF97] [vW97].

Home

Zcash deep dive

Explaining Snarks Series

Rollup

Repo

Libraries and toolkits

There is a problem that the various libraries aren’t standardised.

libsnark

Bellman

jsnark - Java bindings to libsnark

snarkjs

Snarky

Zokrates

Ethsnarks

Circom - circuit compiler

Useful Resources

Blogs

http://coders-errand.com/zero-knowledge-proofs-a-laymans-introduction/
https://blog.decentriq.ch/zk-snarks-primer-part-one/
http://coders-errand.com/zk-snarks-and-their-algebraic-structure/

Talks from the last workshop

Additional Reading

Hackathon projects