![]() |
|
![]() |
|
Thread Tools |
![]() |
#1 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
29·167 Posts |
![]()
The following is primarily about feasibility. There may well be errors. Hopefully any errors are not fatal to it. There probably are optimizations and impacts I haven't seen.
https://www.mersenne.org/various/math.php says, about prime95/mprime's P-1 factoring: "The P-1 method is quite simple. In stage 1 we pick a bound B1. P-1 factoring will find the factor q as long as all factors of k are less than B1 (k is called B1-smooth). We compute E - the product of all primes less than B1. Then we compute x = 3^(E*2*P). Finally, we check the GCD (x-1, 2^P-1) to see if a factor was found." If the above is a current and accurate description of the prime95 P1 computation process, or even if it is not in certain ways, it appears to offer some opportunities to employ the Jacobi symbol check in part of the prime95 P-1 factoring effectively. A quick review of parts of the prime95 v29.4b7 source code indicates the description might be somewhat more completely stated as: "We compute most of 2*p*E - the product of powers of all odd primes less than B1 and 2 and exponent P, up to a limit. Then we compute x= 3^(E*2*P) and in the process include the rest of the computation of E implicitly." So the full E or E*2*p term may be not available for a separate Jacobi check of it. I think most or all of it is precomputed for usual PrimeNet assignments. For example, for a B1 bound of 800,000, there are 63,951 primes smaller than the bound. These primes (except for 2?) raised to a power to produce a value just under B1 will have up to 20 bits each. Current P-1 assignment exponents ~88 million occupy 27 bits, and the factor 2 occupies one. So an estimate of number of bits in E*2*p is 63,950 * 20 + 28 = ~ 1,279,028 bits. This is Nbits / B1 = ~1.599 bits. For B1 1,000,000 and exponent 91 million, there are 78498 primes smaller than the B1 bound. 91 million also occupies 27 bits; 2 one, so E*2*p ~ 78497 * 20 + 28 =~ 1569968 bits; Nbits/B1 =~ 1.57. The prime95 v29.4b7 source says, in part: /* First restart point. Compute the big exponent (a multiple of small */ /* primes). Then compute 3^exponent. The exponent always contains 2*p. */ /* We only compute 1.5 * B bits (up to about 20 million bits). The rest of the */ /* exponentiation will be done one prime at a time in the second part of stage 1. */ So, about 15/16 (93.75%) - 15/15.7 (95.5%) of the bit total of a usual (E*2*P) is precomputed. (This will change somewhat as GIMPS progresses on exponents.) It's close enough to 100% that a small code change could lift it to 100% for some time to come for P-1 assignments being assigned just ahead of first-time primality checks. In CUDAPm1, stage one is also raising 3 to a power of the product of the primes below B1 (http://www.mersenneforum.org/showpos...1&postcount=20) Since it's doing the same math and was somewhat derived from prime95's approach, the following should also apply to it more or less. Unless CUDAPm1's math sufficiently deviates in the details to require a separate treatment, the following applies to CUDAPm1. (CUDAPm1 is hard to read code, at least for me; P-1 math plus CUDA plus not as many comments.) Any other software (gmp-ecm P-1 factoring, for instance http://www.mersenneforum.org/showthread.php?t=4087) may also be somewhat checkable with Jacobi symbols. Haven't looked at it at all for feasibility or whether it's already there. 1) Jacobi symbols are multiplicative if either top or bottom are constant. So (a/n) (b/n) = (ab/n) 2) From 1), the value of an even power of Jacobi symbol is +1 or 0 not -1; (a/n)(a/n)= (a^2/n) = 1 whether (a/n) is -1 or +1 (or it's 0 if there's a factor f>1 in common for a and n, including if a=n, n>1). 3) x = 3^(E*2*P) is always an even power of 3. 4) 3) and 2) in combination, give a Jacobi symbol (x/Mp) = 1 (or 0 if there's a factor of Mp in there), but not -1. Let a=3^(E*p), n=Mp. (a/n)(a/n) = 1 = (a^2/n) = ( ((3^(E*p))^2 / Mp) = ( (3^(E*2*p) / Mp) = ( x / Mp ) 5) x appears to be computed modulo Mp, for speed and size. So a natural denominator for the Jacobi is also Mp. Jacobi check time for (x/Mp) should be similar to what Preda has determined for the LL form of gpuOwL and similar size exponents which was under a minute for up to 78M bits for different model cpus. This is also consistent with Prime95 LL test Jacobi check times. 6) If E was fully computed, we could also try to determine the correct Jacobi (E/n) or ((E*2*p)/n) for some well chosen fixed n (n= M31 or perhaps n=M61, since it would fit in registers and be relatively prime to the smaller primes and exponent p; B1<=B2 which typically <<M31 for routine GIMPS P-1 factoring assignments). Doing so appears to require of order 80,000 individual small Jacobi computations for current P-1 Primenet assignments (for the small primes < B1~10^6), and combining them via 1). The speed of this computation would be high because the operands are word size, and speed could be improved by use of precomputed partial results and saving extensions to higher B1 as the program operates, since n is constant. That's only one side of the check on E; the other side is to compute (E/n) or ((E*2*p)/n) and compare. This should also be faster than Preda's timed Jacobi evaluations, since the size of E is ~1.6 million bits for current exponents, much shorter than the ~78 million of the PRP residue corresponding to Preda's timing. A Jacobi symbol check could be applied up to the limit computed explicitly. I'm not sure if this check on computation of E (or most of E or E*2*P) is worth doing. Returning to considering the more comprehensive Jacobi of 5): If (x/Mp) yields 1, it's consistent with the computation being correct. If (x/Mp) is value 0 instead of 1, there's a factor, or the Jacobi computation erred. In the 0 case, the Jacobi could be quickly rechecked. The gcd should reveal a factor. It should be of the correct forms; 1 or 7 mod 8, 2kp+1. If it does not find a factor, or it claims to have found a factor but the claimed factor does not pass the form checks, the gcd has erred. I've seen this happen sometimes in CUDAPm1. (The gcd could be retried immediately by the factoring program, provided it has retained a copy of the input data, since its computation is considerably shorter than the preceding computations. The retry could be attempted by a different library's gcd function than the first time, since a gcd computation error might be a software issue. Or the gcd's input could be saved to disk for later attempts by other means, or inspection for debugging.) If (x/Mp) yields Jacobi value -1, the P-1 calculation has erred, or the Jacobi check has. In the -1 case the Jacobi could be quickly retried. If the Jacobi repeatedly produced -1, the P-1 attempt or underlying hardware could be regarded as suspect. But. Does the Jacobi change due to the modulo Mp operation? Apparently not, since a modulo is part of calculating the Jacobi symbol, as the first step states. https://en.wikipedia.org/wiki/Jacobi..._Jacobi_symbol So, it appears stage one P-1 computation could be checked (before the -1 or gcd) with as little as a single Jacobi evaluation taking less than a minute on a decent cpu. If we could compute the Jacobi check on the cpu, and the bulk of the P-1 test on the gpu, in CUDAPm1, it would separate the computations in a way that somewhat guards against some system memory errors, gpu memory errors, cpu and gpu microcode bugs, etc. affecting both the P-1 and the Jacobi check. The above covers stage 1. Stage 1 for Mersenne numbers ~85 million exponent take ~3.25 hours (nearly 200 minutes) on a gtx1070, or around 12 hours on an i7-7500u chip (both cores). Checking it once or twice occupying half to one minute per check seems like a good trade. Especially when someone is wondering whether a dearth of factors in their gpu run history is due to statistics or due to error. The check could be made a user selectable option, to preserve current speed in the usual case, and employed when there is question about the reliability of the software or hardware, or a higher reliability P-1 result is desired. Stage 2 seems to be a different situation. The terms ( H^q - 1 ) in https://en.wikipedia.org/wiki/Pollar...92_1_algorithm are not even powers. The Jacobis are multiplicative but not additive. Therefore as far as I can tell, the correct Jacobi of the stage 2 product is not fixed as stage 1's (or the LL sequence) was, and would need to be determined for each bound and exponent. This would entail as many Jacobi evaluations as there were primes between B1 and B2, to determine a correct Jacobi product. Since B1 =~ 7*10^5 and B2 =~17*10^6, that's of order 10^6 Jacobi evaluations, one per prime between B1 and B2 (https://en.wikipedia.org/wiki/Prime-counting_function) at about .5 to 1 minute each, a prohibitively large computation cost at nearly a year to imperfectly check a P-1 computation of perhaps a day duration. So it seems a P-1 stage 2 Jacobi check is out of the question. (Unless there's some way to much more swiftly determine what the correct Jacobi would be for the stage 2 product.) However, checking at a late point in the stage 1 calculation, an intermediate value which is used also in the stage 2 calculation, helps safeguard accuracy of an aspect of stage 2's computation, at no additional computing cost beyond the stage 1 check. |
![]() |
![]() |
![]() |
#2 | |
"Robert Gerbicz"
Oct 2005
Hungary
1,429 Posts |
![]() Quote:
rewording a little (but you had the discovery): making an error check that detects all errors with (roughly) 50% chance in the first stage of the Pollard's (P-1) method ! we calculate in that stage res==a^E mod n, where E is the product of all prime powers <= B1, so if B1>1 and gcd(a,n)=1 (trivial condition), then (res/n)=1, because E is even, where here we used the Jacobi symbol. If at any powmod operation we make an error at the multiplication then by roughly 50% chance we get (res2/n)=-1, and suppose that we have done this mistake not at the first prime power (double check the a^(2^K) mod n calculation) , then the remaining primepowers are all odd, so when you calculate res2^E2 mod n and you won't do further mistake then you won't change the Jacobi symbol, and so like at the Jacobi test for LLT you can catch all errors with (roughly) 50% chance. A quick and dirty demo in PARI-Gp: Code:
perrortest(n,emult,B,perr)={r=Mod(3,n)^emult;forprime(p=2,B,q=p;while(q<=B/p,q*=p);r=r^q;if(p==perr,r=Mod(random(n),n))); if(kronecker(lift(r),n)!=1,return(0),return(1))} sum(i=1,1000,perrortest(2^113-1,2*113,10000,3511)) sum(i=1,1000,perrortest(1+2*3*5*7*11*13*17*19*random(2^100),1,10000,3511)) (18:58) gp > sum(i=1,1000,perrortest(2^113-1,2*113,10000,3511)) %95 = 507 (18:58) gp > sum(i=1,1000,perrortest(1+2*3*5*7*11*13*17*19*random(2^100),1,10000,3511)) %96 = 475 (18:58) gp > [we used kroncker test, because that is available in PARI, an extension of Jacobi, but Jacobi would be enough] in perrortest we use the fixed base=3, using emult if we know that all/some factor of (p-1) has a factor of emult, using B=B1 and making a random error at p=perr prime [note that at test perr should be an odd prime<=B, to make a real error]. We return by 0 if an error detected in the test, otherwise we return by 1. |
|
![]() |
![]() |
![]() |
#3 | |
"Robert Gerbicz"
Oct 2005
Hungary
101100101012 Posts |
![]() Quote:
This depends also on your method how you get powmod, say if you calculate a^17 mod n: a->a^2->a^4->a^8->a^16->a^17 (mod n eveything) and you make error in a^2 mod n, then you won't reuse it at multiplication, just only its square, so the (possible) Jacobi error will be corrected at a^17 mod n in every case. The successfull error detection rate could be somewhere(?) at 10-20% for large Mersenne numbers. |
|
![]() |
![]() |
![]() |
#4 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
113538 Posts |
![]()
I am confident that the Jacobi test is not yet applied in the prime95 program to P-1 factoring, as of v29.4b7, since I both reviewed the source code and asked the author what checks were included.
I'm rather confident the Jacobi symbol error check is not yet applied in the CUDAPm1 program, since error's post about using it for LL test checking occurred years after the last code development and maintenance for CUDAPm1, and there was no obvious sign of it in either the forum thread describing CUDAPm1's development, or in the CUDAPm1 source code. In other software, such as gmp-ecm, I have no idea if it's there or not. Thanks for reviewing it. I'll need more time to understand your retraction. However, even a 10-20% chance of error detection _might_ be worthwhile. Particularly in the case of a GIMPS participant questioning whether his installation is reliable, detecting a Jacobi error occasionally, at such low probabilities, is still higher than the probability of finding a factor, and the detection probability should occur whether there is a factor found or to be found by P-1, or not. (0.9 ^90 is a small number) For larger GIMPS exponents, P-1 factoring run times can be quite long. Hundred-megadigit exponents take days each on a GTX1070 for modest bounds, and run time scales roughly as >p^2, paralleling primality test run-times. Last fiddled with by kriesel on 2018-06-28 at 19:59 |
![]() |
![]() |
![]() |
#5 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
113538 Posts |
![]()
A check of all my P-1 results on GIMPS for the past year shows the following:
605 no-factor-found 9 stage 1 factor (as determined by B1=B2 in the report) Of these, one was selected to run knowing it already had a known factor, as a reliability test, so should be excluded. Stage 1 factors found 8/(8+9+605) = 1.29% 9 stage 2 factor (as determined by B2 !=B1 in the report) Stage 2 only runs if stage 1 does not find a factor. Stage 2 factors found in 9/(605+9)=1.47% Ratio found 1.47/1.29=1.14 So odds appear about equal stage 1 vs stage 2, and ~2.73% overall (17/622) Ratios are given to digits far exceeding what's justified, to avoid round-off error only. The statistics are very poor with only 8 or 9 factor-found samples per stage. Ten or a hundred times as many would be better. Last fiddled with by kriesel on 2018-06-29 at 13:49 |
![]() |
![]() |
![]() |
#6 |
"Robert Gerbicz"
Oct 2005
Hungary
1,429 Posts |
![]()
Now I understand why it isn't in the main programs (p95/gmp-ecm), the reason is that they are using a top-bottom way at powmod using sliding window technique (you can see that it is used in gmp also at powmod). And with this you're using only squaring in a row, then a single mulmod (using even a precalculated table), so the (possible) Jacobi error will be corrected in the next squaring.
You could do the powermod in the other direction, but in that case you'd lost in speed, maybe we'd use only your method if we can't store 3 (or 2) residues in memory at the precalculated table and base is large. |
![]() |
![]() |
![]() |
#7 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
484310 Posts |
![]() Quote:
a) the information in the Jacobi result is less than two bits since there are only 3 values. That's not enough to encode success or failure of several prior squarings or multiplies as well as the last operation (squaring or mul by a). b) even a rather significant error in the first squaring to kth squaring, where a^k+error effect < Mp, doesn't hurt the final P-2 factoring outcome. It's equivalent to starting with a different a, which is allowed. That's good; the math is inherently resistant to early errors. So there's no need, or payoff, to check for error very early in the powmod. Error in the power to which the base a is raised, however, is deadly in stage 1. E x 2 x p x random may be ok, but what I chose to test was E x 2 x p + random signed 31-bit value (such as a memory error). Last fiddled with by kriesel on 2018-07-02 at 19:43 |
|
![]() |
![]() |
![]() |
#8 |
"Mihai Preda"
Apr 2015
133410 Posts |
![]()
It is my impression that when computing 3^x (mod Mp) through left-to-right exponentiation (i.e. starting with the most significant bit), that the Jacobi check is not so useful, for this reason:
every step (every bit of "x") does a square followed sometimes by mul-by-3, and the square basically wipes out any previous error that would show in Jacobi. So the error, instead of persisting, is erased at every iteration. |
![]() |
![]() |
![]() |
Thread Tools | |
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
error during Jacobi check on 330,000,000+ exponent | evanh | Hardware | 5 | 2018-02-20 03:46 |
Endlessly Running Jacobi error check on v29.3 | emiller | Software | 10 | 2017-11-14 10:26 |
LL testing: Jacobi symbol of the (interim or final) residue minus 2 as error check | GP2 | Number Theory Discussion Group | 33 | 2017-08-21 22:14 |
Yet another new factoring algorithm\primality test:Digital Coding ?? | tServo | Miscellaneous Math | 3 | 2014-04-10 18:52 |
Paging whoever is P-1 trial-factoring 3.05M - 3.13M | GP2 | Data | 0 | 2003-11-04 08:43 |