mersenneforum.org  

Go Back   mersenneforum.org > Math Stuff > Computer Science & Computational Number Theory

Reply
 
Thread Tools
Old 2020-01-25, 20:52   #78
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

112·13 Posts
Default

Quote:
Originally Posted by ewmayer View Post
@Jeppe: This is simply the basic Pépin residue ... (And also one wants the 2nd double-check run at the slightly larger FFT length to finish and confirm the Pépin-test results.)
Have you checked that:
(3^(2^(2^m)) mod F_n) mod q = 3^(2^(2^m)) mod q
is true for the stored interim residues?

Since you have 3^(2^(2^m)) mod F_n then to get the left side is trivial and you can get quickly the right side, becuase q is small , the check is cheap. Though it is not that very-very strong check, but the two known factors could find a mismatch with probability = 1-(1e-10) which is still quite impressive. Note that we don't get the strength of 1/(q-1) or 1/znorder(Mod(3,q)) because the order is divisible by a "large" power of two, and that will be lost in the iterated squarings of a wrong residue.

Last fiddled with by R. Gerbicz on 2020-01-25 at 20:54 Reason: small typo
R. Gerbicz is offline   Reply With Quote
Old 2020-01-27, 21:41   #79
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

267318 Posts
Default

Quote:
Originally Posted by R. Gerbicz View Post
Have you checked that:
(3^(2^(2^m)) mod F_n) mod q = 3^(2^(2^m)) mod q
is true for the stored interim residues?

Since you have 3^(2^(2^m)) mod F_n then to get the left side is trivial and you can get quickly the right side, becuase q is small , the check is cheap. Though it is not that very-very strong check, but the two known factors could find a mismatch with probability = 1-(1e-10) which is still quite impressive. Note that we don't get the strength of 1/(q-1) or 1/znorder(Mod(3,q)) because the order is divisible by a "large" power of two, and that will be lost in the iterated squarings of a wrong residue.
ITYM (3^(2^(2^m - 1)) mod Fm) mod q = 3^(2^(2^m - 1)) mod q, since the Pépin test is just an Euler-pseudoprime test (which happens to prove rigorous in the special case of the Fm), i.e. computes 3^(2^((Fm - 1)/2)) mod Fm = 3^(2^((2^m)/2)) mod Fm. But in any event, that's a good suggestion - fiddled my code to load the final Pépin-test residue R from the savefile and for the 2 known small prime factors p=640126220763137 and q=1095981164658689 computed

R == 367500396407933 (mod p) and
R == 95971534699540 (mod q).

I then separately computed 3^(2^(2^m - 1)) mod p and q via (2^m-1) = 1073741823 iterated squarings modulo p and q, separately, and the results match, which anyone can confirm on even quite modest hardware. So we have quite good confidence in the result, on top of the 2 runs agreeing through ~740 Miter (as of the latest 64M-FFT run update from Ryan), and said runs being done on high-end server ssytems with ECC memory.

For F31 I will also be using the now-famous periodic whole-residue check named in honor of the above-quoted forum member. :)

Last fiddled with by ewmayer on 2020-01-27 at 21:43
ewmayer is offline   Reply With Quote
Old 2020-01-28, 08:49   #80
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

110001001012 Posts
Default

Quote:
Originally Posted by ewmayer View Post
I then separately computed 3^(2^(2^m - 1)) mod p and q via (2^m-1) = 1073741823 iterated squarings modulo p and q, separately, and the results match, which anyone can confirm on even quite modest hardware.
Trivial note: it is faster to reduce the exponent mod (p-1) [use Fermat's little theorem].
R. Gerbicz is offline   Reply With Quote
Old 2021-07-02, 18:57   #81
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

112·97 Posts
Default Fermat-modmul with circularly shifted residues

I spent an appreciable amount of time last year wrestling with basic concepts and implementation-in-code of Fermat-mod squaring chains in the presence of residue shift. This proves rather more interesting (a polite euphemism for "pain in the butt") than for the Mersenne-mod case. The Gerbicz check renders it less important, but it may still prove useful in some contexts, so figured I'd post a summary.

Let s0 = initial shift, i.e. instead of starting the Pépin test of Fm with initial seed x0, we start with (x0 << s0) (mod Fm).

[1] Data layout: for Mersenne-mod IBDWT, we have 2 different wordsizes differing by one bit, so need to loop over the elements in our variable-wordsize array until we find the one containing the (s0)th bit, then shift x0 by the appropriate small number of bits to place it on the proper bit boundary within that target word. For Fermat-mod, the easy case is that of power-of-2 FFT length, for which we have a fixed power-of-2 wordsize. More generally we will have a non-power-of-2 FFT length optimized for the target modulus - e.g. for F24 we can use 896 Kdoubles, and for F30 I did one run at 60M and the other (now 93% complete) at 64M. In both cases we use a special acyclic-transform-effecting DWT which multiplies our N input words by the first N (2N)th complex roots of unity, i.e. the (j)th input gets multiplied by DWT weight w[j] = exp(I*Pi*j/N). This means multiplying our original real input word by a complex number, but we notice that w[j+N/2] = I*w[j], making it natural to group the (j)th and (j+N/2)th input words together in memory and treat them as a single input to a complex FFT. This is the so-called "right-angle transform" trick, and is detailed in the original Crandall/Fagin 1994 DWT paper. In the Mersenne-mod IBDWT case we can also treat adjacent even/odd-indexed imput elements as a single input to a complex FFT, but as the Mersenne-mod IBDWT weights are strictly real, these are simply pairs of adjacent input words.

For Fermat-mod arithmetic using a non-power-of-2-length transform we need to layer a Mersenne-style dual-wordlength base representation and similarly defined IBDWT weights atop our acyclic-effecting DWT. One simplification relative to the Mersenne case is that for FFT length [odd]*2^k, the Fermat-mod wordsizes and IBDWT weights recur in repeating length-[odd] patterns.

In the context of a right-angle Fermat-mod acyclic transform, finding the target word for a given initial-seed shift means looping first over the even-indexed elements of our residue vector (considered here as a length-N real array), then if the accumulated wordsizes are still less than the target shift s0, looping over the odd-indexed elements until we reach the target word. For the Fermat-mod case we need only do this for the initial seed of the Pépin test; for the Mersenne-mod case, specifically the Lucas-Lehmer test, we must also repeat the exercise on each subsequent iteration in order to find the proper bit offset for injection of the -2 term added to the outout of each mod-squaring output. This can be done very efficiently - specifically in O(1) time - by initializing a suitable bitmap encoding the variable wordsizes of the Mersenne-mod IBDWT and using that to effect a fast lookup scheme.

[2] For a basic squaring-chain-with-shift scheme as we do for Mersenne number M(p), each mod-squaring doubles the shift count (mod p). That means that for any s0 != 0 (mod p) the shift will remain nonzero and nonrepeating during the LL/PRP-test, since at squaring i, our shift s = s0.2^i (mod p) and the multiplicative group (mod p) is of maximal order p-1.

OTOH for squaring chains modulo a Fermat number Fm, each subsequent squaring doubles the shift count (mod 2^m), so we have the problem that any nonzero initial shift s0 will lead to a shift s = 0 after precisely m square-mods, and the shift will remain 0 throughout the remainder of the squaring chain. One way to get around this problem is to define an auxiliary pseudorandom-bit array, and on the (i)th square-mod, if bit[i] = 1, do a further mod-doubling of the residue, thus yielding the shift-count update s[i] = 2.s[i-1] + bit[i]. For a pair of Pépin tests of Fm such as are typically done to allow for real-time cross-validation of interim residues, it is also advisable to seed the bit-arrays differently for each run.

[3] For the random-bit-offset scheme in [2], we further need a postprocessing step to determine the correct sign to apply to the final residue; this relates to the shift-doubling modulo the binary exponent which occurs on each square-mod. For Mp, squaring a shifted residue R.2^s gives R^2.2^2s (mod Mp) and 2^2s == 2^(2s-p) (mod Mp), so at each iteration we simply double the shift count (mod p). For arithmetic modulo Fm, by way of contrast, each time the doubled shift count incurs a reduction (mod 2^m) the shifted residue undergoes a change of sign: R^2.2^2s (mod Fm) == -R^2.2^(2s-2^m) (mod Fm), so a shifted-residue scheme must keep track of both the shift count and the sign of the shifted residue, relative to the true unshifted one.

However: since - unlike LL - this PRP test involves just repeated mod-squaring, we don't care if the sign of the residue on any particular intermediate iteration is flipped, since the subsequent mod-squaring will flip it back to +. The only thing we care about is whether the sign of the final residue (either for the Pépin test or for a particular checkpoint subinterval thereof) has flipped w.r.to that of its immediate predecessor, the penultimate residue, since such a final-squaring sign-flip has no ensuing mod-squaring to autocorrect it. Thus, if there is a final-squaring sign flip we need to unflip the sign of the resulting residue manually. But that is not all, since generating a shift-removed final residue also involves a modular multiply or divide by a power of 2. I've always found it conceptually easier to multiply-mod rather to divide-mod, so for both that and less-code-to-maintain reasons, my mi64 multiword int library has low-level code only for leftward circular shift, and effects s-bit rightward circular shift of a b-bit number by a (b-s)-bit leftward one. That is all that is needed working modulo a Mersenne number, as these are of form 2^p-1 and thus any bits off-shifted to the left get rewrapped from the right with + sign. Once again, Fermat moduli are trickier. Modulo Fm, a leftward circular shift must feed the bits off-shifted to the left back into the right end with a - sign, and the aforementioned shrc-done-as-shlc substitution implies a negation (mod Fm) of the resulting shift-removed residue.

Thus, if at end of our current checkpoint interval or Pépin test, our residue R needed a sign-flip due the doubled-shift-count needing a reduction (mod 2^m), our shrc-done-as-shlc already took care of it. If not, the residue needs an explicit negation. Rather than doing an explicit Fm - R with the attendant digitwise negations and borrows, one can simply do a bitwise-complement of the residue vector and further += 2 of limb 0 in order to recover the unshifted residue.

Last fiddled with by ewmayer on 2021-08-04 at 00:42 Reason: ficks speelink
ewmayer is offline   Reply With Quote
Old 2022-05-13, 21:54   #82
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

112·97 Posts
Default F25-F30 cofactor status, Part 1

[I PMed a more-or-less-identical draft of this to Prime95, ATH; yorix, jasonp, philmoore and R. Gerbicz yesterday]

F25-F30 cofactor status:

In the context of testing the cofactor-PRP functionality which will be supported in Mlucas v21, I've at long last done Suyama-style PRP tests on the cofactors of F25-F30, using the Pépin-test residues I've generated for same over the past 10 years, as detailed in this thread - original primality-tests for F25-26 detailed in post #1 (recently updated with latest results), F27 in #2 and #5, F28 in #4 and #20, F29 in #29,#42,#52. For F30, the interim-residue files and logfile for the completed run @60 M FFT have been uploaded but the 99.5%-complete double-check run @FFT 64M is stalled due to my KNL having gone offline last week - I suspect the cheapie PSU installed by the vendor of this barebones system, but not yet had time to get under the hood, so to speak.

The cofactor-PRP results are tabulated in a "Part 2" followup to this note. My code only supports the Suyama cofactor-PRP method, which starts from the basic Pépin-test residue, does one mod-squaring to generate the Fermat-PRP residue A from that and an additional lg2(F) mod-squarings to generate the subtrahend B for the Suyama test, where F is the product of known prime factors; it makes little sense to spend roughly the same amount on a direct PRP test of such cofactors as needed by the Pépin test of F_m, only to have to re-do a similar amount of work when a new factor is discovered.

I first added some basic cofactor-PRP code to Mlucas in early 2018, and used it to test the cofactors of F25-29 using the Pépin-test residues I had generated via my primaiity tests for same. Did not post the cofactor-PRP results at the time because the accompanying Res64 values mismatched ones posted by Andreas Höglund (a.k.a. ATH), who had done cofactor-PRP tests of F25 and F26 using George's code. At the time I was trying to figure out why said result mismatched Andreas' value for what I thought was the same quantity, and in the ensuing PM exchange it did emerge that George's code-at-the-time (2009) which Andreas used in fact was doing a direct-PRP-test of the cofactor, but I was unaware of the cofactor-PRP runs for F25 and F26 done by forumite Yar (a.k.a. yorix - details below) in late 2017 using the then-latest Prime95 version. Having more pressing concerns at the time I decided to put that aside and revisit later. Mlucas v21 will have 2 major feature adds, PRP-CF and PRP-cert support, so in the context of the first, 'later' is now.

In 2009-2010 Andreas Höglund used George's code to test the cofactors of F25-27; cf. Posts #51, #62, #64 in this thread:
Quote:
UID: athath, F25/known_factors is not prime. RES64: 44BFC8D231602007. Wd1: B9307E03,00000000
Known factors used for PRP test were: 25991531462657,204393464266227713,2170072644496392193

UID: athath, F26/76861124116481 is not prime. RES64: 6C433D4E3CC9522E.

UID: athath, F27/151413703311361/231292694251438081 is not prime. RES64: 481F26965DE16117.
Those posts were not clear on precisely what *type* of cofactor-PRP test was run - direct Fermat-PRP test on the cofactor C, or a Pépin-test (R = 3^((N-1)/2) (mod N)) of the Fermat number N = F*C followed by the Suyama postprocessing step (cf. the post by Phil Moore here), where one computes first the Fermat-PRP residue A = 3^(N-1) (mod N) = R^2 (mod N) via a single mod-squaring of the Pépin-test residue R, then uses the product of known factors F to compute B = 3^(F-1) (mod N), and checks if the difference (A - B) is divisible by the cofactor C. The Suyama version is preferable because it starts with a basic Pépin-test of the N, and as new factors are found that same residue can be used to quickly (#of squarings = bits in product-of-known-prime-factors) check each resulting cofactor for PRP-ness.

In late 2017 user Yar (don't know last name, uid = yorix) posted a thread about his own cofactor-PRP runs for F25-26, again using George's code, but now mentioning 2 different run types for each, which clarifies the types of cofactor-PRP tests used:
Quote:
PRP test for F25 with known factors: PRP=N/A,1,2,33554432,1,"25991531462657,204393464266227713,2170072644496392193" gives 'composite' with res64: 7B6B087B84A45562

PRP test for F26 with known factors: PRP=N/A,1,2,67108864,1,"76861124116481" gives 'composite' with res64: FBB406B3A281838C
We see that Res64 differs from ATH's above - Yar notes "This test with base 3 and returns the same residue independent of the number of known factors", which indicates a Suyama-style cofactor-PRP test.

For F25 and F26 my Fermat-PRP Res64 values match Yar's for the Suyama-style cofactor check, and his direct-cofactor results match ATH's (though that is a same-software-used match). But it would be nice if someone using George's code could confirm not just the Fermat-PRP residues (A) for the above, but for purposes of pronouncing the ensuing cofactor-PRP results "cross-verified using independent codes", we should also cross-check the (A - B) mod C ones, where C is the cofactor. The Fermat-PRP computation of course constitutes the overwhelming bulk of a PRP-CF run, but it is important to also have assurance that the B-residue and (A - B) mod C have been correctly computed.

[
@George, would it be possible for you to tweak your code to print the Res64 checksum for the latter quantity? If Yar still has the Fermat-PRP-test residues from his runs of F25 and F26 using your code, it would then be a simple matter to rerun the Suyama step from those and cross-check the (A - B) mod C residues. If not, the F25 and F26 full-length tests are now easily runnable on modest hardware.

Also, is there a different Prime95/mprime worktodo-entry syntax for a direct-PRP test of a cofactor vs a 2-step approach (Fermat-PRP of N = F*C followed by Suyama step, or did you switch from the former to the latter along the way? Yar only echoes the workfile entry for the 2-step run, but notes that used 29.4 and then in a later post notes retrying with v29.3 and getting results matching Andreas' earlier ones.
]

Lastly, I did not see any post by Yar regarding a Suyama-style cofactor-PRP test for F27 using Prime95. It would be nice for someone to do such so we can put that one to bed, as well, at least until another factor is discovered. I could obviously do it myself, but I think the "different persons using different codes" optics would be better. Compute cost would be 20-30% more than a current GIMPS wavefront-PRP test.

Results for my Mlucas v21-prototype cofactor-PRP runs follow in Part 2.
ewmayer is offline   Reply With Quote
Old 2022-05-13, 22:05   #83
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

112·97 Posts
Default F25-F30 cofactor status, Part 2

In each case I begin with a quick Pépin-test residue sanity check, summarized here:
Quote:
For PRP-tests of N with known-factors (e.g. prelude to a cofactor-PRP postprocessing step),
R. Gerbicz [post #79 of https://mersenneforum.org/showthread...=18748&page=8] suggests, using
the specific case of the (m)th Fermat number F_m to illustrate:
Quote:
Have you checked that:
(3^(2^(2^m)) mod F_m) mod q = 3^(2^(2^m)) mod q
is true for the stored interim residues?

Since you have 3^(2^(2^m)) mod F_n then to get the left side is trivial and you can get quickly
the right side, becuase q is small , the check is cheap. Though it is not that very-very strong
check, but the two known factors could find a mismatch with probability = 1-(1e-10) which is
still quite impressive. Note that we don't get the strength of 1/(q-1) or 1/znorder(Mod(3,q))
because the order is divisible by a "large" power of two, and that will be lost in the iterated
squarings of a wrong residue.
(For Pépin-test residue, we'd actually check (3^(2^(2^m - 1)) mod F_m) mod q = 3^(2^(2^m - 1)) mod q.)
More generally, for the powermod residue R = a^pow (mod N) where N has prime factor q, we check whether

a^pow (mod N) = a^pow (mod q).

If pow > q, the RHS can be simplified using Fermat's little theorem, a^(q-1) == 1 (mod q), thus
a^pow == a^pow' (mod q), where pow' = pow (mod q-1).

Example: For the final Pépin-test residue R for F30 (N = 2^2^30 + 1, pow = (N-1)/2 = 2^(2^30 - 1)), with
2 known small prime factors p = 640126220763137 and q = 1095981164658689, we get

p: pow' = 2^(2^30 - 1) (mod p-1) = 433448099512320, R == 367500396407933 (mod p)
q: pow' = 2^(2^30 - 1) (mod q-1) = 96576634617856, R == 95971534699540 (mod q).
In the results below, for each prime factor, 'A: R == ' is the full residue (mod p) and 'B: R ==' is the (mod p) squaring-chain checksum computed as above.

For all the key quantities I give the "enhanced Selfridge-Hurwitz residue" triplet, which consists of residues mod 2^64 (hex), 2^35-1 and 2^36-1, the latter two printed in decimal form. Lastly, I computed GCD(A - B,C) to check whether each cofactor is possibly a prime power, as described in Phil Moore's Part-1-linked post [or cf. Crandall & Pomerance (2nd ed.), exercise 4.8]. Unsurprisingly, all cofactors come up "composite", and even more unsurpisingly, none is a prime power:
Code:
F25: (I also re-ran @2048K FFT and confirmed the same values):
Computing 33554431-squaring residue R (mod known prime q = 25991531462657)
	A: R == 5785371429337 (mod q)
	B: R == 5785371429337 (mod q)
Computing 33554431-squaring residue R (mod known prime q = 204393464266227713)
	A: R == 45685916416586703 (mod q)
	B: R == 45685916416586703 (mod q)
Computing 33554431-squaring residue R (mod known prime q = 2170072644496392193)
	A: R == 991358595838356746 (mod q)
	B: R == 991358595838356746 (mod q)
F25: using FFT length 1792K = 1835008 8-byte floats.
 this gives an average   18.285714285714285 bits per digit
Doing one mod-F25 squaring of iteration-33554431 residue [Res64 = CB913B6E3B1FFE2A] to get Fermat-PRP residue
Fermat-PRP residue (A)     = 0x7B6B087B84A45562,26994100025,66886679148
Processed 162 bits in binary modpow; MaxErr = 0.203125000
3^(F-1) residue (B)        = 0x2909830729BFA110,30348546187,21153567739
(A - B) Res64 = 0x526185745AE4B453, C Res64 = 0x21CC1C03F0000001
(A - B)/C: Quotient = 10040206794592865905247433568023540557761720874501, Remainder Res64 = 0x7A551893ACF4BE4E
Suyama Cofactor-PRP test of F25 / 25991531462657 / 204393464266227713 / 2170072644496392193 with FFT length 1835008 = 1792 K:
	(A - B) mod C has Res64,35m1,36m1: 0x7A551893ACF4BE4E,34178045549,65496009072.
This cofactor is COMPOSITE [C10100842]. Checking prime-power-ness via GCD(A - B,C) ...
GCD((A - B)[33554304 bits], C[33554270 bits]) = 1
Time for GCD = 00:00:10.857
Cofactor is not a prime power.

F26: (I also re-ran @4096K FFT and confirmed the same values):
Computing 67108863-squaring residue R (mod known prime q = 76861124116481)
	A: R == 10772922302714 (mod q)
	B: R == 10772922302714 (mod q)
F26: using FFT length 3584K = 3670016 8-byte floats.
 this gives an average   18.285714285714285 bits per digit
Doing one mod-F26 squaring of iteration-67108863 residue [Res64 = 2B58BDE2A0C14045] to get Fermat-PRP residue
Fermat-PRP residue (A)     = 0xFBB406B3A281838C, 2193939191, 7371940776
Processed 46 bits in binary modpow; MaxErr = 0.187500000
3^(F-1) residue (B)        = 0x25F5AB0FFC728C87,32173443562,20444894301
(A - B) Res64 = 0xD5BE5BA3A60EF705, C Res64 = 0x23FFBA1860000001
(A - B)/C: Quotient = 49561426973911, Remainder Res64 = 0xB7BD14979ACF222E
Suyama Cofactor-PRP test of F26 / 76861124116481 with FFT length 3670016 = 3584 K:
	(A - B) mod C has Res64,35m1,36m1: 0xB7BD14979ACF222E, 6416457631,15466936163.
This cofactor is COMPOSITE [C20201768]. Checking prime-power-ness via GCD(A - B,C) ...
GCD((A - B)[67108864 bits], C[67108818 bits]) = 1
Time for GCD = 00:00:25.555
Cofactor is not a prime power.

F27: (I also re-ran @7.5M and 8M FFT and confirmed the same values):
Computing 134217727-squaring residue R (mod known prime q = 151413703311361)
	A: R == 36549260191848 (mod q)
	B: R == 36549260191848 (mod q)
Computing 134217727-squaring residue R (mod known prime q = 231292694251438081)
	A: R == 68921389296064753 (mod q)
	B: R == 68921389296064753 (mod q)
F27: using FFT length 7168K = 7340032 8-byte floats.
 this gives an average   18.285714285714285 bits per digit
Doing one mod-F27 squaring of iteration-134217727 residue [Res64 = 043A6C8B9272B297] to get Fermat-PRP residue
Fermat-PRP residue (A)     = 0xC3B191C45CCD7820,21887650168,55240256170
Processed 104 bits in binary modpow; MaxErr = 0.281250000
3^(F-1) residue (B)        = 0x485F292142F953EC,20036545122,26481669619
(A - B) Res64 = 0x7B5268A319D42434, C Res64 = 0xD8C9BECF60000001
(A - B)/C: Quotient = 13668745121613830518711853822333, Remainder Res64 = 0x79E89190C56372B7
Suyama Cofactor-PRP test of F27 / 151413703311361 / 231292694251438081 with FFT length 7340032 = 7168 K:
	(A - B) mod C has Res64,35m1,36m1: 0x79E89190C56372B7, 2608954171,29330680430.
This cofactor is COMPOSITE [C40403531]. Checking prime-power-ness via GCD(A - B,C) ...
GCD((A - B)[134217664 bits], C[134217624 bits]) = 1
Time for GCD = 00:00:59.492
Cofactor is not a prime power.

F28: (I also re-ran @16M FFT and confirmed the same values):
Computing 268435455-squaring residue R (mod known prime q = 1766730974551267606529)
	A: R == 688455961638747199546 (mod q)
	B: R == 688455961638747199546 (mod q)
F28: using FFT length 15360K = 15728640 8-byte floats.
 this gives an average   17.066666666666666 bits per digit
Doing one mod-F28 squaring of iteration-268435455 residue [Res64 = 468731C3E6F13E8F] to get Fermat-PRP residue
Fermat-PRP residue (A)     = 0x7CAE1275B362B2CA,20461105324,30894284803
Processed 70 bits in binary modpow; MaxErr = 0.296875000
3^(F-1) residue (B)        = 0xBD496177DB5155A0,33633822046, 3778945069
(A - B) Res64 = 0xBF64B0FDD8115D2B, C Res64 = 0x39AEB33000000001
(A - B)/C: Quotient = 585290509330851034797, Remainder Res64 = 0x790BE051D73D667E
Suyama Cofactor-PRP test of F28 / 1766730974551267606529 with FFT length 15728640 = 15360 K:
	(A - B) mod C has Res64,35m1,36m1: 0x790BE051D73D667E, 4159568998,55362539930.
This cofactor is COMPOSITE [C80807103]. Checking prime-power-ness via GCD(A - B,C) ...
GCD((A - B)[268435392 bits], C[268435386 bits]) = 1
Time for GCD = 00:02:16.895
Cofactor is not a prime power.

F29: (I also re-ran @32M FFT and confirmed the same values):
Computing 536870911-squaring residue R (mod known prime q = 2405286912458753)
	A: R == 2364493566098202 (mod q)
	B: R == 2364493566098202 (mod q)
read_ppm1_savefiles: Hit EOF in read of FFT-kblocks ... assuming a pre-v18 savefile.
F29: using FFT length 30720K = 31457280 8-byte floats.
 this gives an average   17.066666666666666 bits per digit
Doing one mod-F29 squaring of iteration-536870911 residue [Res64 = BECE1E5FFBE5EC12] to get Fermat-PRP residue
Fermat-PRP residue (A)     = 0x9BD416DB3918C152,32748692453,58920206666
Processed 51 bits in binary modpow; MaxErr = 0.375000000
3^(F-1) residue (B)        = 0x30AFC6010F62884B, 7473784009, 6544795868
(A - B) Res64 = 0x6B2450DA29B63908, C Res64 = 0x3FF7746780000001
(A - B)/C: Quotient = 1671099484324226, Remainder Res64 = 0xAB3DBCEFFE907786
Suyama Cofactor-PRP test of F29 / 2405286912458753 with FFT length 31457280 = 30720 K:
	(A - B) mod C has Res64,35m1,36m1: 0xAB3DBCEFFE907786,13349243207,41564307636.
This cofactor is COMPOSITE [C161614233]. Checking prime-power-ness via GCD(A - B,C) ...
GCD((A - B)[536870912 bits], C[536870861 bits]) = 1
Time for GCD = 00:05:09.784
Cofactor is not a prime power.

F30: (I also re-ran @64M FFT and confirmed the same values):
Computing 1073741823-squaring residue R (mod known prime q = 640126220763137)
	A: R == 367500396407933 (mod q)
	B: R == 367500396407933 (mod q)
Computing 1073741823-squaring residue R (mod known prime q = 1095981164658689)
	A: R == 95971534699540 (mod q)
	B: R == 95971534699540 (mod q)
F30: using FFT length 61440K = 62914560 8-byte floats.
 this gives an average   17.066666666666666 bits per digit
Doing one mod-F30 squaring of iteration-1073741823 residue [Res64 = A70C2A3DB98D6D9D] to get Fermat-PRP residue
Fermat-PRP residue (A)     = 0x1D7AA56A5428333C,30698378043,40327643421
Processed 99 bits in binary modpow; MaxErr = 0.406250000
3^(F-1) residue (B)        = 0x6F856680482D3F4E, 9072042323,48656226334
(A - B) Res64 = 0xADF53EEA0BFAF3EE, C Res64 = 0xFFF9D50500000001
(A - B)/C: Quotient = 26538003597554197569639536298, Remainder Res64 = 0xEDFCE7FC50271D44
Suyama Cofactor-PRP test of F30 / 640126220763137 / 1095981164658689 with FFT length 62914560 = 61440 K:
	(A - B) mod C has Res64,35m1,36m1: 0xEDFCE7FC50271D44, 9723857098,34401987830.
This cofactor is COMPOSITE [C323228467]. Checking prime-power-ness via GCD(A - B,C) ...
GCD((A - B)[1073741760 bits], C[1073741725 bits]) = 1
Time for GCD = 00:11:38.593
Cofactor is not a prime power.
ewmayer is offline   Reply With Quote
Old 2022-07-01, 22:35   #84
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

112×97 Posts
Default F30 primality test and matching double-check

[See posts #65, #70, #74 and #75 in this thread for previous notes re. the F30 Pépin tests.]

At looooooooooooong last my multiple-systems-hopping on-again/off-again DC run of the Pépin test on F30 (the first-completed run was @60K, DC run was @64M FFT has finished. A rough and ready précis of the systems used along the way:
  • 25 Mar - 02 Apr 2017: By way of exploration, started a Pépin test of F30 on my Intel Haswell quad, 4c4t, using 256-bit AVX2+FMA3 SIMD build of Mlucas 15.0. This ran at a throughput of ~100 ms/iter for a little over a week, completing 5.6M iterations (0.52% complete).

  • 02 Sep 2017 - 17 April 2019: Resumed in 64-threaded mode on a 64-core KNL server crowdfunded by GIMPS members and physically hosted by David Stanfill, using an AVX-512 build of Mlucas 17.0 (the just-released version supporting AVX-512.) That ran at a rate of 68 ms/iter, reaching iteration 734400000 (68.4% complete). Around that time the KNL went offline and David Stanfill went AWOL; I believe the 'net slang is "ghosted". [Aside: David had mentioned he was cryptocurrency "mining" in a major way; somehow that and such disappearing acts seem to go together, as one is seeing with the recent global crypto-ponzi collapse.]

  • 28 Nov 2020: Resumed from the 730M-iter checkpoint in 64-threaded mode on a 68-core|272-thread KNL workstation bought used for $500, using avx512 build of same Mlucas 17.1 codebase used for above for consistency, and running one thread on each of physical cores 0-63. This crunched steadily at a throughput of 64 ms/iter for 11 months, reaching 1014M-iter, a little over 94% complete.

  • 31 Oct 2021: Continued in 64c64t mode on the above KNL workstation, but now as a low-priority "backup" run to a 64c128t p-1 run (first to B1 = 10^7, then in 100M-prime-interval chunks to B2 = 400M) of F33, using the p-1 functionality added to Mlucas over the course of most of 2021. In backup-run mode throughput was ~300 ms/iter; this ran for ~6 months, reaching 1070M-iter, a little over 99.66% complete, until 4 May this year, when the KNL gve up the ghost.

  • 12 Jun 2022: After various attempts at fault-tracing and system-revival proved fruitless, I pulled the SSD out of the KNL and installed in my desktop open-frame GPU crunching rig, and with some assistance from Paul Underwood successfully mounted the KNL filestystem (the KNL ran CentOS, vs the desktop rig's Ubuntu, so some FS-naming-convention sleuthing was needed). The filesystem was thankfully intact, so I copied the F30-run logfile and latest checkpoint file over to my 2c4t Intel NUC8i3CYS mini and completed the F30 double-check run @64M FFT using AVX-512 build there. That ran at ~360 ms/iter and finished on 28 June; the Selfridge-Hurwitz residues match those of the earlier run @60M FFT:
Code:
[Jun 27 22:45:52] F30 Iter# = 1073690000 [100.00% complete] clocks = 01:00:38.498 [  0.3638 sec/iter] Res64: C52F182238099986. AvgMaxErr = 0.018915548. MaxErr = 0.023437500.
[Jun 27 23:46:36] F30 Iter# = 1073700000 [100.00% complete] clocks = 01:00:37.609 [  0.3638 sec/iter] Res64: B7D9DAA9B4FC17D8. AvgMaxErr = 0.018934628. MaxErr = 0.023437500.
[Jun 28 00:47:22] F30 Iter# = 1073710000 [100.00% complete] clocks = 01:00:39.309 [  0.3639 sec/iter] Res64: D90E3A766F5D1B5C. AvgMaxErr = 0.018922704. MaxErr = 0.023437500.
[Jun 28 01:48:07] F30 Iter# = 1073720000 [100.00% complete] clocks = 01:00:38.582 [  0.3639 sec/iter] Res64: 2D8B46F0E8274E4C. AvgMaxErr = 0.018944431. MaxErr = 0.023437500.
[Jun 28 02:48:53] F30 Iter# = 1073730000 [100.00% complete] clocks = 01:00:39.394 [  0.3639 sec/iter] Res64: 1CFA1EBC9CEE1C12. AvgMaxErr = 0.018940419. MaxErr = 0.025390625.
[Jun 28 04:07:09] F30 Iter# = 1073740000 [100.00% complete] clocks = 01:18:08.842 [  0.4689 sec/iter] Res64: 005D2E424C47F6A8. AvgMaxErr = 0.018948376. MaxErr = 0.023437500.
[Jun 28 04:27:00] F30 Iter# = 1073741823 [100.00% complete] clocks = 00:19:42.173 [  0.6485 sec/iter] Res64: A70C2A3DB98D6D9D. AvgMaxErr = 0.018940398. MaxErr = 0.023437500.
F30 is not prime. Res64: A70C2A3DB98D6D9D. Program: E17.1
F30 mod 2^36     =          58947628445
F30 mod 2^35 - 1 =          26425548225
F30 mod 2^36 - 1 =          59810773698
So in total, a little over 5 calendar years from start to finish for this patchwork-quilt DC run. The complete-final-residue files also have matching MD5 checksums, and as described in the above posts re. cofactor status, for such cases with one or more known factors, we can also use those to quickly compute modular-squaring-chain checksums of bitness roughly on the order of the known factors.

I've posted links-to/descriptions-of the every-100Miter interim-checkpoint files, the full final residue file and the 2 run-logfiles (for the first run @60M FFT and the just-completed DC @64M) in the OP of this thread. I also have persistent full checkpoint files at 10Miter intervals, but as each such files is 2^30/8 = 128 Mbyte, and there are ceiling(2^30/(10 million)) = 108 such, thus the total archive is ~13GB. The residue files are more or less completely random byte data, i.e. cannot effectively be compressed. I have burned a copy of that archive to a thumb drive as a backup for the one on my HD, and will make it available via a snail-mailed 16GB microSD on request should anyone wish to do e.g. a parallel run validation via rerun of said 10Miter intervals.

I don't know with any certainty whether this constitutes the first gigabit primality-test-with-double-check; if anyone has heard of another such, please post a link.

For F31 and beyond - I have the Gerbicz check working for Fermat-mod arithmetic in the in-dev Mlucas v21 code; on top of that we now also have the PRP-proof mechanism being used for GIMPS work; that can also be used for Pépin-test work, as that inherently uses a long chain of successive mod-squarings. In effect, a certificate of primality, independently checkable in a small fraction of the time of the original primality test.

Last fiddled with by ewmayer on 2022-07-02 at 21:50
ewmayer is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
P-1/P+1 on Fermat numbers ATH Operazione Doppi Mersennes 2 2015-01-25 06:27
What are the Primality Tests ( not factoring! ) for Fermat Numbers? Erasmus Math 46 2014-08-08 20:05
LLT numbers, linkd with Mersenne and Fermat numbers T.Rex Math 4 2005-05-07 08:25
Two Primality tests for Fermat numbers T.Rex Math 2 2004-09-11 07:26
Fermat Numbers devarajkandadai Math 8 2004-07-27 12:27

All times are UTC. The time now is 11:31.


Sun Jul 3 11:31:03 UTC 2022 up 80 days, 9:32, 0 users, load averages: 1.73, 1.36, 1.35

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔