mersenneforum.org > Math VDF (Verifiable Delay Function) and PRP

2019-08-18, 18:30   #12
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

10110000001₂ Posts

Quote:
 Originally Posted by kriesel sqrt(p) residues or ln(p)? (or log2(p)?)
sqrt(p) residues.

Quote:
 Originally Posted by kriesel Does this verification only confirm the LL run was completed, or does it check its correctness?
You could even mix these strategies: do fast PRP checks on composites, and fast LL checks on proven (Mersenne) primes. You really can do this, because in
(2+sqrt(3))^(2^n) = U(n) + V(n)*sqrt(3) mod Mp
you can compute this only about 2 times slower than the regular LL/PRP test!
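The identity above can be checked numerically (a sketch in Python; the helper names are mine, not from any GIMPS code): since (2-sqrt(3)) is the inverse of (2+sqrt(3)), the Lucas-Lehmer sequence satisfies s(n) = 2*U(n) mod Mp, so U(p-2) == 0 exactly when the LL test succeeds.

```python
# Check (2+sqrt(3))^(2^n) = U(n) + V(n)*sqrt(3) (mod Mp) against the
# Lucas-Lehmer sequence s(0)=4, s(k+1)=s(k)^2-2, using s(n) = 2*U(n).
def square_ring(u, v, mp):
    """One squaring of u + v*sqrt(3) in Z[sqrt(3)] mod mp."""
    return (u * u + 3 * v * v) % mp, (2 * u * v) % mp

def ll_is_prime(p):
    """Classic Lucas-Lehmer test for Mp = 2^p - 1 (p an odd prime)."""
    mp = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % mp
    return s == 0

p = 7                       # M7 = 127 is prime
mp = (1 << p) - 1
u, v = 2, 1                 # (2 + sqrt(3))^(2^0)
for _ in range(p - 2):      # p-2 squarings reach (2+sqrt(3))^(2^(p-2))
    u, v = square_ring(u, v, mp)
# Mp is prime iff s(p-2) == 0, i.e. 2*U(p-2) == 0 mod Mp
assert ((2 * u) % mp == 0) == ll_is_prime(p)
```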

2019-08-18, 18:41   #13
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

1,409 Posts

Quote:
 Originally Posted by kriesel Does this verification only confirm the LL run was completed, or does it check its correctness?
In my methods we check the final residue by simply reusing the (also verified) b^n0 residue in the ring (for example b = 2, 3, or 2+sqrt(3)); here n - n0 < sqrt(n), so with a few further iterations we can reach the final b^n.

2020-02-04, 09:56   #14
Fan Ming

Oct 2019

5·19 Posts

Quote:
 Originally Posted by preda We call the two actors involved "the author", who does the hard work of computing the PRP and producing the proof, and "the verifier", who receives the proof from the author and checks it. The goal is to convince the verifier that the result of the PRP is genuine and correct even when the verifier does not trust the author.

If the author sent the verifier only the PRP result (y), the verifier could re-do the whole PRP computation *again* and thus validate the result (i.e. the double-check we do currently), but this would be very costly to the verifier, and wasteful because of the double computation.

To make the verification easier, the author also sends the "middle" point u=f(x, p/2), which the author computed anyway on the way to y. (Let's assume for a moment that "p" is even (although it isn't), so we can halve it without additional trouble.) The verifier can now verify f(x, p/2)==u and f(u, p/2)==y, and this is equivalent to verifying f(x, p)==y; but still no reduction in work for the verifier.

Because the two verification "branches" above share the same exponent p/2, they can be combined and verified together. Choosing a random factor "r1" (let's say "r1" is a 64-bit number), the verifier computes the new seed x1 = x^r1 * u, and verifies that f(x1, p/2)==y1, where y1 = u^r1 * y. So we transformed the verification f(x,p)==y into f(x1,p/2)==y1, halving the work. This "halving" is possible because the author sent the verifier the value "u" (the middle value between x and y) -- so "u" is part of the proof.

But if we have a way to halve the work of the verifier, why stop after only one halving step? Let's look at the next halving: the author makes available "u1", the middle point between x1 and y1, u1=f(x1, p/4). r2 is another random 64-bit number chosen by the verifier; the verifier computes x2=x1^r2 * u1 and y2=u1^r2 * y1, and verifies f(x2, p/4)==y2 with only a quarter of the work of the author. ...
Seems like a good idea. If this technique were used as the verification process on the server, my understanding is that a bogus PRP result submitted to the server could be rejected outright by this check.
But what about other possible problems, for example programming bugs?
My understanding is that double-checking with a different shift can reveal not only bogus results but also programming bugs and other tricky problems. A final match means it is nearly impossible for the result to have problems, so it is verified.
The Gerbicz Error Checking is a really powerful checking technique, but some problems, for example programming bugs, can still let a bad result "pass" the GEC, so a double check is still needed.
However, what about implementing this checking technique locally (i.e. on the user's computer) together with GEC? Would that help exclude possible undetected errors? I mean, can this catch errors exactly the way a double check with a different shift does? Or is there still the possibility that programming bugs or other problems let a bad result pass both this check and the GEC simultaneously? (I don't know the details of what kinds of problems caused bad PRP results historically, and doing this check is somewhat similar to doing just a single GEC if we set r1=r2=...=1, so I guess it can't.)

Last fiddled with by Fan Ming on 2020-02-04 at 10:08

2020-05-27, 00:34   #15
preda

"Mihai Preda"
Apr 2015

3·421 Posts
PRP-proof

Quote:
 Originally Posted by preda [to be continued]
I had some progress (finally) with the implementation of PRP-Proof in gpuowl. I would like to add some detail about the implementation, ask for feedback, and discuss how it fits in the grander scheme going forward.

Given the exponent E of the Mersenne candidate 2^E - 1,
to work around the problem of E not being a power of 2, we choose a convenient multiple of a power of two just under E that we call "topK" -- let's take topK as the largest multiple of 1024 that is smaller than E.

The proof is going to cover only the PRP iteration range from the beginning to topK. The verifier will first verify the proof up to topK, after which it will re-do the remaining iterations from topK to E (which is quick, because the distance from topK to E is less than 1024).

Let's call the starting value ("residue") of the PRP test "A", in our case A==3.

During the progress of the PRP test, we save the residue (meaning full data residue, not just the 64bits of res64) at the iteration topK. This residue will be part of the proof, let's call it "B" as it represents the final point of PRP iterations covered by the proof.

What we want to prove is that a number of topK iterations (squarings) applied to A produces B. (i.e. A^(2^topK) == B )

As a notation let's use prp(C, n) to denote the result of applying n PRP iterations (squarings) to C. Thus we want to prove that
prp(A, topK)==B.

To facilitate the proof we also save the residue at iteration topK/2; let's call it M (from "middle"). (M has the properties prp(A, topK/2)==M and prp(M, topK/2)==B.)

At this point the proof contains {E, M, B}, and the verifier can check the proof in topK/2 iterations by choosing any random value h and verifying that
prp(A^h * M, topK/2) == M^h * B.

So adding M to the proof halved the verification effort. We recursively apply the same trick a number of times, every additional step halving the verification effort. For example, the next step consists in adding to the proof the point M1 that is halfway along the topK/2 iterations between A^h * M and M^h * B, i.e. M1 = prp(A^h * M, topK/4).

The first middle, M (which we're going to call M0 from now on), was encountered during the normal progress of the original PRP test (thus, to obtain M0 we simply save that residue). OTOH M1 is not one of the original PRP iterations; nevertheless it can be efficiently computed from two residues of the original PRP test (the ones at iterations topK/4 and 3*topK/4).

We can choose how far to go with this sequence of "middles", M0, M1, ... . I'm going to call the number of middles that are part of the proof the "power" of the proof. When power==1 (the smallest proof), we store in the proof in addition to B only M0, and the verification effort is topK/2 iterations. When power==2, the size of the proof increases by one residue (adding M1), and the verification effort is halved to topK/4 iterations.
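The halving step can be checked numerically: since prp(A, t) = A^(2^t) mod Mp, we have prp(A^h * M, t) = A^(h*2^t) * M^(2^t) = M^h * B whenever prp(A, t) == M and prp(M, t) == B. A toy sketch in Python with a small Mersenne modulus (the names are illustrative, not gpuowl's):

```python
import random

def prp(c, n, mod):
    """Apply n PRP iterations (squarings) to c, mod 'mod'."""
    for _ in range(n):
        c = c * c % mod
    return c

E = 13
mp = (1 << E) - 1           # toy modulus M13 = 8191
topK = 8                    # a power of two below E, standing in for the real topK
A = 3
M = prp(A, topK // 2, mp)   # middle residue, saved during the test
B = prp(A, topK, mp)        # final residue covered by the proof

# One halving step: the identity holds for any h, so a random 64-bit h
# chosen by the verifier convinces it in half the iterations.
h = random.getrandbits(64)
assert prp(pow(A, h, mp) * M % mp, topK // 2, mp) == pow(M, h, mp) * B % mp
```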

Let's look next at effort for computing the proof.

Last fiddled with by preda on 2020-05-27 at 00:36

2020-05-27, 01:29   #16
preda

"Mihai Preda"
Apr 2015

2357₈ Posts
Cost of computing the PRP-proof

The original tester (the one who runs the PRP test) is also the one computing (generating) the PRP-proof. The cost is two-fold: a storage (disk space) cost and a computation cost.

1. Space Cost

For M0, we need to save the residue at iteration topK/2 (one residue). For M1, we need to save the residues at topK/4 and 3*topK/4 (two residues). And so on: for M[i] we need to save 2^i residues while going through the initial PRP test, so for a proof of power N we need to save 2^N residues in total. (For a 100M exponent, a residue is roughly 12MB in size, so for a proof of power==9 we need about 6GB of disk space for residue storage during the progress of the PRP test.) Once the proof has been computed (and presumably verified to make sure it was computed correctly), this storage can be released.

The size of the proof itself (of power N) is just N+1 residues, so for power==9 the proof size is 120MB (much smaller than the transient space requirement of 6GB during the PRP test).

2. Computation Cost

In order to be a proof, it must be non-fakeable (i.e. it must be impossible to "prove" something false), even when a dishonest author purposely attempts to fake it. This resistance is achieved through the use of the random value "h" when checking that
prp(A^h * M, topK/2) == M^h * B
the random value "h" being chosen by the verifier. But the particular choice of "h" above impacts the computation of the follow-up middles (M1, M2, etc.) which must be computed by the proof author. This conundrum is solved by using the Fiat-Shamir heuristic:
https://en.wikipedia.org/wiki/Fiat%E...amir_heuristic
The FS heuristic allows the proof author to know the value of "h" (i.e. it removes the verifier's liberty to choose any random value he'd like), as long as the proof author has absolutely no liberty in choosing or affecting the value of "h" in the proof context (if the proof author could choose the value of "h", he could fake the proof).

This is achieved by setting "h" from a secure hash of the proof context: h = hash(E, topK, B, M0). Similarly to how we have a sequence of middles M0, M1, ..., we also have a sequence of such exponents "h", which we're going to call h0, h1, ...:

contextHash = hash(E, topK, B)
h0 = hash(contextHash, M0)
h1 = hash(h0, M1)
h2 = hash(h1, M2)
etc.

For computing the middle M[i], we use a set of 2^i saved residues, raised to powers computed from products of h[0] ... h[i-1]. (Thus M0 is always stored in the proof "unadulterated" by any hash.) Example:

M0 = residue[topK/2]
M1 = residue[topK/4]^h0 * residue[3*topK/4]
M2 = r[topK/8]^(h1*h0) * r[3*topK/8]^h0 * r[5*topK/8]^h1 * r[7*topK/8]
etc.

The bulk of the proof computation cost is in these exponentiations residue^product(h). It grows a bit faster than 2^proofPower. I estimate the computation cost for a proof of power==9 to be the equivalent of 200K iterations, which for a 100M exponent means the proof computation costs 0.2% of the PRP test itself (so, low enough).

Next I'll look into the cost of proof verification.

Last fiddled with by preda on 2020-05-27 at 01:34
2020-05-27, 01:58   #17
preda

"Mihai Preda"
Apr 2015

2357₈ Posts
Proof verification cost

The verifier obtains the proof consisting of {E, topK, B, M[0 .. N-1]} (i.e. the exponent, the upper bound topK of the proof, the final residue B at iteration topK, and the vector of N middles for a proof of power N).

The verifier sets:
A0 = 3, B0 = B
rootHash = hash(E, topK, B)

And iteratively computes:
h0 = hash(rootHash, M0)
A1 = A0^h0 * M0, B1 = M0^h0 * B0
h1 = hash(h0, M1)
A2 = A1^h1 * M1, B2 = M1^h1 * B1

and continues thus through the full set of middles, obtaining A[N] and B[N]. At this point the verifier must run topK/(2^N) PRP iterations to verify that:
prp(A[N], topK/(2^N)) == B[N]

If this equality holds, the proof is valid; otherwise it isn't. So the bulk of the verification cost for a proof of power N is topK/(2^N) PRP iterations. For power==9 and a 100M exponent this comes to about 200K iterations, 0.2% of the cost of the PRP test.

Last fiddled with by preda on 2020-05-27 at 02:01
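An end-to-end toy sketch in Python of the scheme in the last three posts: the author assembles each middle from saved residues (using prp(r[k], span) == r[k+span]), the exponents h come from a Fiat-Shamir hash chain (Python's blake2b truncated to 64 bits stands in for the real hash), and the verifier replays the chain. All names and byte layouts here are illustrative assumptions, not gpuowl's actual ones:

```python
import hashlib

def prp(c, n, mod):
    """n PRP iterations (squarings) of c."""
    for _ in range(n):
        c = c * c % mod
    return c

def fs_hash(*parts):
    """64-bit Fiat-Shamir hash of integers (stand-in for the real Blake2b use)."""
    data = b"".join(p.to_bytes(16, "little") for p in parts)
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")

E, topK, power = 13, 8, 2      # toy sizes; real runs use 100M-range exponents
mp = (1 << E) - 1
A0 = 3
step = topK >> power

# --- author: save residues r[k] = prp(3, k) at multiples of topK/2^power ---
r = {k: prp(A0, k, mp) for k in range(0, topK + 1, step)}
B = r[topK]

# A is tracked as a product of saved residues r[k]^e, so every middle is
# assembled from saved residues only, as in the M1/M2 formulas above.
factors = {0: 1}               # A0 == r[0]^1
middles = []
h = fs_hash(E, topK, B)        # rootHash
span = topK
for _ in range(power):
    span //= 2
    M = 1
    for k, e in factors.items():              # M = prp(A, span)
        M = M * pow(r[k + span], e, mp) % mp  #   = product of r[k+span]^e
    h = fs_hash(h, M)
    middles.append(M)
    new = {k: e * h for k, e in factors.items()}   # A <- A^h ...
    for k, e in factors.items():                   #   ... * M
        new[k + span] = new.get(k + span, 0) + e
    factors = new

# --- verifier: replay the hash chain, then only topK/2^power iterations ---
Av, Bw = A0, B
h = fs_hash(E, topK, B)
for M in middles:
    h = fs_hash(h, M)
    Av = pow(Av, h, mp) * M % mp
    Bw = pow(M, h, mp) * Bw % mp
assert prp(Av, step, mp) == Bw   # proof accepted
```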
2020-05-27, 09:25   #18
preda

"Mihai Preda"
Apr 2015

2357₈ Posts
Proof file format

We'd like the proof that is generated by one piece of software to be verifiable by a different software -- having multiple software capable of verifying proofs reduces the chances of uniform bugs (affecting both the proof generator and the verifier in the same way) or the possibility of intentional "cheating" by the programmer.

For this we need to agree on a format for the proof file to allow interoperable reading/writing of the proof. We also need to agree on the cryptographic hash that is used, as both sides (proof author and verifier) need to see the same hash values.

1. File format
As a starting point, proof save/load is already implemented in gpuowl, so I'm going to describe the format it uses. (Of course the goal is to reach agreement on a format that is acceptable to multiple software authors.)

The proof file starts with one line containing textual (ASCII) information with a single '\n' line terminator. This line is called the header, and contains in order:
"PROOF" : a string identifying the file format
1: the version of the proof file format
exponent
topK
power
finalHash, a 64-bit hash value (discussed below)

All numbers are in decimal, except finalHash which is in hexadecimal.

Quote:
 PROOF 1 2976221 2975744 8 c7b648a462da0b28
Right after the header we have a set of residues in binary.
A residue is a sequence of 32-bit little-endian words, with the least significant word first. As each residue contains exactly E (E being the exponent) bits, the very last 32-bit word of the residue is only partially filled -- the extra bits (beyond E) are 0.

(so the length in bytes of a residue in this format is ((E-1)/32 + 1)*4 )
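As a sanity check of that layout, here is a small sketch of packing/unpacking a residue in this format (the helper names are hypothetical, not gpuowl's; note that little-endian 32-bit words stored least-significant first is the same as a plain little-endian byte string for the whole value):

```python
def residue_to_bytes(value, E):
    """Pack an E-bit residue as little-endian 32-bit words, LSW first."""
    n_bytes = ((E - 1) // 32 + 1) * 4
    return value.to_bytes(n_bytes, "little")

def residue_from_bytes(data, E):
    value = int.from_bytes(data, "little")
    assert value < (1 << E), "extra bits beyond E must be 0"
    return value

E = 2976221                      # the exponent from the example header above
data = residue_to_bytes(12345, E)
assert len(data) == ((E - 1) // 32 + 1) * 4 == 372028
assert residue_from_bytes(data, E) == 12345
```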

The first residue after the header represents B (the residue at iteration topK); it is followed by N residues representing the middles (N being the proof power), in order starting with M0.

And that's all file-format wise; I think it's as simple as it gets. I'm looking for feedback from fellow developers whether this format looks acceptable, or what changes are proposed.

The finalHash is a value that allows checking both the integrity of the file and the correct computation of the hashes, to be discussed next.

Last fiddled with by preda on 2020-05-27 at 09:31

2020-05-27, 14:10   #19
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

19·241 Posts

Quote:
 Originally Posted by preda I'm looking for feedback from fellow developers whether this format looks acceptable, or what changes are proposed.
For certain uses, making res64 very accessible without launching either the originating program, or a special program to produce res64 from the contents, is valuable. I have a situation now where that would be useful for the cudalucas file format.

2020-05-28, 00:00   #20
preda

"Mihai Preda"
Apr 2015

3·421 Posts

Quote:
 Originally Posted by kriesel For certain uses, making res64 very accessible without launching either the originating program, or a special program to produce res64 from the contents, is valuable. I have a situation now where that would be useful for the cudalucas file format.
Yes that sounds good. (also useful as a check during verification)

2020-05-28, 01:43   #21
preda

"Mihai Preda"
Apr 2015

2357₈ Posts
Cryptographic hash

We need a cryptographic hash function for choosing the exponents "h" in
prp(A^h * M, k) == M^h * B

After a bit of consideration I chose Blake2 (the Blake2b variant) which, in addition to being fast and secure, is also very simple to implement.

Concerning the size of the exponents "h" in the proof, I propose we use 64 bits. (The larger the hash size, the more secure the proof, but the effort to generate the proof also increases proportionally. The proof verification, OTOH, is not affected much by the hash size.)

How to compute the hashes: the simplest scheme I came up with is to have the function hash(bytes) which computes a 64-bit Blake2b hash of the bytes.

1. Start by computing the "rootHash":
rootHash = hash(E, topK, B)
which covers pretty much everything except the middles. What it does *not* cover, in addition to the middles, is: the proof power, the finalHash value (a check), and the res64 value (redundant). When hashing we need to turn E and topK into bytes -- in my implementation I represent E and topK as 4 bytes (32 bits) little-endian (which does limit the exponent to 32 bits).

2. For every step, update the hash by mixing the middle in:
h0 = hash(rootHash, M0)
A1 = A0^h0 * M0
B1 = M0^h0 * B0
h1 = hash(h0, M1)
etc.

I realize code may be clearer:
Code:
u64 h = Blake2::hash({E, topK, B});
for (u32 i = 0; i < power; ++i) {
    Words& M = middles[i];
    h = Blake2::hash({h, M});
    A = gpu->expMul(A, h, M);  // A = A^h * M
    B = gpu->expMul(M, h, B);  // B = M^h * B
}
// check h == finalHash

The final value of the hash at the end of the above loop should be compared for equality with the finalHash from the proof file header -- their equality indicates both file integrity and a correct implementation of hash(). This check is done before the expensive part of proof verification (the large number of PRP iterations), to avoid paying the verification time just to reject a corrupted file or to detect a wrong hash application.
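For reference, the same hash chain written as runnable Python, using hashlib.blake2b with digest_size=8. The exact byte layout of the hash input here (field widths and concatenation order) is my assumption, following the 32-bit little-endian convention described above; the residue encoding reuses the proof-file format:

```python
import hashlib

def blake2_64(data: bytes) -> int:
    """64-bit Blake2b digest, interpreted little-endian."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")

def residue_bytes(value: int, E: int) -> bytes:
    """Residue encoded as in the proof file: ((E-1)//32 + 1)*4 bytes, LE."""
    return value.to_bytes(((E - 1) // 32 + 1) * 4, "little")

def hash_chain(E, topK, B, middles):
    """rootHash = hash(E, topK, B); then h = hash(h, M) for each middle.
    Returns the list h0, h1, ...; the last value would be compared against
    finalHash from the proof header."""
    h = blake2_64(E.to_bytes(4, "little") + topK.to_bytes(4, "little")
                  + residue_bytes(B, E))
    hs = []
    for M in middles:
        h = blake2_64(h.to_bytes(8, "little") + residue_bytes(M, E))
        hs.append(h)
    return hs
```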
2020-05-28, 02:36   #22
preda

"Mihai Preda"
Apr 2015

2357₈ Posts
Verification

The proof file contains B, the presumed residue at PRP iteration topK. Verifying the proof proves that B is correct, i.e. that prp(3, topK) == B.

After this, the verifier can proceed to run the iterations from topK up to E, which is a tiny amount of work (fewer than 1024 iterations). This way the verifier finds out the res64 (type-1 and type-4), whether the PRP test outcome is probable-prime or composite, etc. As a convenience, the expected type-1 res64 is included in the proof header as well; by comparing against it the verifier gains confidence in this final small number (<1024) of iterations (let's call these final iterations on top of B "the tail"). (Of course the verifier could also double-check the tail, although this won't be needed in practice because the whole proof verification will be double-checked anyway.)
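The tail is just the remaining squarings run on top of B; a toy sketch (here "res64" is simply the low 64 bits of the final residue -- the precise GIMPS residue-type conventions are not modeled):

```python
def tail_res64(B, topK, E, mp):
    """Run the remaining E - topK PRP squarings from B; return the low 64 bits."""
    x = B
    for _ in range(E - topK):
        x = x * x % mp
    return x & 0xFFFFFFFFFFFFFFFF

# toy check: starting the tail from B must agree with running all E
# squarings from 3 directly
E, topK = 13, 8
mp = (1 << E) - 1
B = 3
for _ in range(topK):
    B = B * B % mp
full = 3
for _ in range(E):
    full = full * full % mp
assert tail_res64(B, topK, E, mp) == (full & 0xFFFFFFFFFFFFFFFF)
```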

