![]() |
![]() |
#452 |
P90 years forever!
Aug 2002
Yeehaw, FL
11111110101012 Posts |
![]() |
![]() |
![]() |
![]() |
#453 | |
If I May
"Chris Halsall"
Sep 2002
Barbados
11,059 Posts |
![]() Quote:
For the latter tools are available, for development, testing, and QA purposes. Code:
chalsall@hobbit:~$ echo '{"hello":"World"}' | jq { "hello": "World" } Last fiddled with by chalsall on 2021-09-30 at 20:34 Reason: s/, humans/, not humans/; # Inverses should *not* be inferred... |
|
![]() |
![]() |
![]() |
#454 | |
Bemusing Prompter
"Danny"
Dec 2002
California
32×277 Posts |
![]() Quote:
Last fiddled with by ixfd64 on 2021-09-30 at 22:41 |
|
![]() |
![]() |
![]() |
#455 |
"University student"
May 2021
Beijing, China
22·67 Posts |
![]()
The time complexity is almost linear logarithmic, O(log(ab)).
Code:
int gcd(int x,int y ) { if(x < y) return gcd(y,x); // x>y if( y == 0) return x; // if y=0, x is GCD else { if( !(x%2) ) { if( !(y%2) ) //x,y both even return 2*gcd(x >> 1, y >> 1); else // x is even, y is odd return gcd(x >> 1, y ); } else { if( !(y%2) ) // x is odd, y is even return gcd(x, y >> 1); else // x, y both odd return gcd(y,x-y); } } } Last fiddled with by Zhangrc on 2021-10-01 at 12:30 |
![]() |
![]() |
![]() |
#456 |
Dec 2002
25×33 Posts |
![]()
What cause of action would be recommended for this core segmentation fault? Nothing else was running at the time.
Code:
[Comm thread Oct 1 13:40] Sending result to server: UID: Tha/Z-170, M9262289 completed P-1, B1=2000000, B2=146000000, Wi8: E7182B42 [Comm thread Oct 1 13:40] [Work thread Oct 1 13:40] [Work thread Oct 1 13:40] P-1 on M9194659 with B1=5000000, B2=TBD [Work thread Oct 1 13:40] Setting affinity to run helper thread 1 on CPU core #2 [Work thread Oct 1 13:40] Using FMA3 FFT length 480K, Pass1=384, Pass2=1280, clm=4, 4 threads [Work thread Oct 1 13:40] Setting affinity to run helper thread 2 on CPU core #3 [Work thread Oct 1 13:40] Setting affinity to run helper thread 3 on CPU core #4 [Comm thread Oct 1 13:40] PrimeNet success code with additional info: [Comm thread Oct 1 13:40] CPU credit is 3.9457 GHz-days. [Comm thread Oct 1 13:40] Done communicating with server. [Work thread Oct 1 13:48] M9194659 stage 1 is 13.85% complete. Time: 458.543 sec. [Work thread Oct 1 13:56] M9194659 stage 1 is 27.71% complete. Time: 454.645 sec. [Work thread Oct 1 14:03] M9194659 stage 1 is 41.58% complete. Time: 458.479 sec. [Work thread Oct 1 14:11] M9194659 stage 1 is 55.44% complete. Time: 456.359 sec. [Work thread Oct 1 14:18] M9194659 stage 1 is 69.30% complete. Time: 460.225 sec. [Work thread Oct 1 14:26] M9194659 stage 1 is 83.16% complete. Time: 436.106 sec. [Work thread Oct 1 14:33] M9194659 stage 1 is 97.02% complete. Time: 436.115 sec. [Work thread Oct 1 14:35] M9194659 stage 1 complete. 14429932 transforms. Time: 3254.242 sec. [Work thread Oct 1 14:35] With trial factoring done to 2^71, optimal B2 is 91*B1 = 455000000. [Work thread Oct 1 14:35] If no prior P-1, chance of a new factor is 8.92% [Work thread Oct 1 14:35] D: 1050, relative primes: 2848, stage 2 primes: 23755332, pair%=94.69 [Work thread Oct 1 14:35] D: 1050, relative primes: 2819, stage 2 primes: 23755332, pair%=94.61 [Work thread Oct 1 14:35] Using 11021MB of memory. [Work thread Oct 1 14:35] Stage 2 init complete. 27785 transforms. Time: 16.280 sec. [Work thread Oct 1 14:44] M9194659 stage 2 is 7.41% complete. Time: 526.239 sec. [Work thread Oct 1 14:52] M9194659 stage 2 is 14.96% complete. Time: 526.555 sec. [Work thread Oct 1 15:01] M9194659 stage 2 is 22.61% complete. Time: 526.607 sec. [Work thread Oct 1 15:10] M9194659 stage 2 is 30.29% complete. Time: 526.931 sec. [Work thread Oct 1 15:19] M9194659 stage 2 is 37.96% complete. Time: 526.598 sec. [Work thread Oct 1 15:27] M9194659 stage 2 is 45.63% complete. Time: 526.763 sec. [Work thread Oct 1 15:36] M9194659 stage 2 is 53.28% complete. Time: 526.889 sec. [Work thread Oct 1 15:45] M9194659 stage 2 is 60.82% complete. Time: 526.498 sec. [Work thread Oct 1 15:54] M9194659 stage 2 is 68.22% complete. Time: 526.293 sec. [Work thread Oct 1 16:03] M9194659 stage 2 is 75.58% complete. Time: 526.346 sec. [Work thread Oct 1 16:11] M9194659 stage 2 is 82.96% complete. Time: 532.123 sec. [Work thread Oct 1 16:20] M9194659 stage 2 is 90.33% complete. Time: 526.133 sec. [Work thread Oct 1 16:29] M9194659 stage 2 is 97.73% complete. Time: 526.361 sec. [Work thread Oct 1 16:32] M9194659 stage 2 complete. 26612490 transforms. Time: 7011.609 sec. [Work thread Oct 1 16:32] Stage 2 GCD complete. Time: 1.302 sec. [Work thread Oct 1 16:32] P-1 found a factor in stage #2, B1=5000000, B2=455000000. [Work thread Oct 1 16:32] M9194659 has a factor: 1184258844011627168698855977003308763574671960534764772350463733066848456139362163010122679897458456388791012515229078287455071 (P-1, B1=5000000, B2=455000000) [Comm thread Oct 1 16:32] Sending result to server: UID: Tha/Z-170, M9194659 has a factor: 1184258844011627168698855977003308763574671960534764772350463733066848456139362163010122679897458456388791012515229078287455071 (P-1, B1=5000000, B2=455000000) [Comm thread Oct 1 16:32] [Work thread Oct 1 16:32] [Work thread Oct 1 16:32] P-1 on M9262387 with B1=2000000, B2=TBD [Work thread Oct 1 16:32] Setting affinity to run helper thread 1 on CPU core #2 [Work thread Oct 1 16:32] Using FMA3 FFT length 480K, Pass1=384, Pass2=1280, clm=4, 4 threads [Work thread Oct 1 16:32] Setting affinity to run helper thread 2 on CPU core #3 [Work thread Oct 1 16:32] Setting affinity to run helper thread 3 on CPU core #4 [Comm thread Oct 1 16:32] PrimeNet success code with additional info: [Comm thread Oct 1 16:32] Composite factor 1184258844011627168698855977003308763574671960534764772350463733066848456139362163010122679897458456388791012515229078287455071 = 367786361 * 201803650751527169 * 267438738796927 * 101994518728598164687111 * 441343633 * 422426099229769 * 23010527886744119 * 62376566657 * 2185975009297 [Comm thread Oct 1 16:32] Already have factor 367786361 for M9194659 [Comm thread Oct 1 16:32] Already have factor 201803650751527169 for M9194659 [Comm thread Oct 1 16:32] Already have factor 267438738796927 for M9194659 [Comm thread Oct 1 16:32] Already have factor 441343633 for M9194659 [Comm thread Oct 1 16:32] Already have factor 422426099229769 for M9194659 [Comm thread Oct 1 16:32] Already have factor 23010527886744119 for M9194659 [Comm thread Oct 1 16:32] Already have factor 62376566657 for M9194659 [Comm thread Oct 1 16:32] Already have factor 2185975009297 for M9194659 [Comm thread Oct 1 16:32] CPU credit is 11.8294 GHz-days. Segmentation fault (core dumped) henk@Z170:~/mersenne$ ^C Code:
Linux64,Prime95,v30.6,build 3 Your choice: 14 Consult readme.txt prior to changing any of these settings. Temporary disk space limit in GB/worker (6.000000): Daytime P-1/P+1/ECM stage 2 memory in GB (10.800000): Nighttime P-1/P+1/ECM stage 2 memory in GB (10.800000): Upload bandwidth limit in Mbps (5.000000): Upload large files time period start (00:00): Upload large files time period end (24:00): Download limit for certification work in MB/day (400): Skip advanced resource settings (Y): |
![]() |
![]() |
![]() |
#457 | ||
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
11100101101102 Posts |
![]() Quote:
Now consider the usual GIMPS case where one of the operands is Mp, p~108, requiring millions of words to store, not just one, and even simple operations may take time ~proportional to p to execute once. The other starting operand is 3large_power Mod Mp, so ~p-1 bits large on average. Also note that potential factors are typically >264 before P-1 is attempted, due to prior TF depth, so the final operands are likely multi-word also. Even assuming the operands are stored in multi-word arrays of >1M packed 64-bit-unsigned initially, one can imagine dividing x>>1 into some small m parallel threads for most of the work, and alternatively computing a new least significant word and bit offset and mask. I don't see how you avoid at least proportional to p iterations, since the algorithm posted removes one bit out of p bits per pass. What is the run time for multiprecision general subtraction, o(n) each where n is the number of bits in the lesser operand? Anything less than ~p2.1 scaling (really O(p2 log p log log p)) makes my case stronger, that scaling downward from larger exponents to the wavefront exponents indicates significant single-core-GCD time at the wavefront. GIMPS code parallelizes individual iterations of PRP or LL or P-1 powering. The fact that one iteration depends on the previous does not preclude parallelizing in iterations or individual operations within iterations, and gaining considerably in speed with available numbers of processor cores. (Four cores / task optimal throughput is routine in CPU-based code prime95, Mlucas. GPU code uses much higher parallelism for the same algorithms.) Making use of parallelism to speed individual iterations of GCD is possible, as a quick web search reveals: https://www.sciencedirect.com/scienc...70866707000585. No one's coded it yet for GIMPS, choosing to focus optimization efforts on larger portions of the P-1 factoring computation first. That paper gives o(n / log n) for n1+e processors "where n is the bit-length of the larger input" which is p for Mp. That is a LOT of processors. GPUs have lots of processors, but not that many. Converting that back to sequential not parallel gives o(p2+e/log p). There are probably slight savings to be had based on knowledge of Mp and its possible factors always being odd, and knowing an Mp-oriented GCD can always be called with a specific operand order, avoiding the initial x>y test. The order of GMP's GCD is given in https://oaciss.uoregon.edu/icpp18/pu...23s2-file1.pdf as Quote:
http://www.iaeng.org/IJCS/issues_v42...CS_42_4_01.pdf is a somewhat different approach, where many GCDs are done in parallel on CPU or GPU. Last fiddled with by kriesel on 2021-10-01 at 15:27 |
||
![]() |
![]() |
![]() |
#458 |
P90 years forever!
Aug 2002
Yeehaw, FL
29·281 Posts |
![]() |
![]() |
![]() |
![]() |
#459 |
"University student"
May 2021
Beijing, China
1000011002 Posts |
![]() |
![]() |
![]() |
![]() |
#460 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·3·52·72 Posts |
![]()
In gmp, & gmplib, which IIRC is used in prime95 / mprime, gpuowl, and now also Mlucas v20.x, P-1 GCD (and probably P+1 GCD in mprime/ prime95), per https://gmplib.org/manual/Nomenclature-and-Types
"A limb means the part of a multi-precision number that fits in a single machine word. (We chose this word because a limb of the human body is analogous to a digit, only larger, and containing several digits.) Normally a limb is 32 or 64 bits." https://gmplib.org/manual/Binary-GCD "At small sizes GMP uses an O(N^2) binary style GCD." where N is number of limbs per operand. "Currently, the binary algorithm is used for GCD only when N < 3." So x,y each less than 129 bits at most. https://gmplib.org/manual/Lehmer_0027s-Algorithm Per iteration, reduces inputs by almost a word size. "The resulting algorithm is asymptotically O(N^2), just as the Euclidean algorithm and the binary algorithm." where N is number of limbs per operand. For Mp of interest to GIMPS wavefront activity, N= p/limbsize = ~10^8/64 or /32. https://gmplib.org/manual/Subquadratic-GCD says "For inputs larger than GCD_DC_THRESHOLD, GCD is computed via the HGCD (Half GCD) function, as a generalization to Lehmer’s algorithm." "The asymptotic running time of both HGCD and GCD is O(M(N)*log(N)), where M(N) is the time for multiplying two N-limb numbers." and where N is number of limbs per operand. M(Mp) is O(p log p log log p) as per Knuth and many other sources for large operands using fft methods. Log2 Mp is ~p; N is p/64 or p/32 depending on unsigned-64-bit or 32-bit in gmp; log N is log (p-64 or p-32) = ~log p for large p such as the >10^8 bit operands common in GIMPS first test or preparatory P-1. So O(M(N)*log(N)) ~ O(p log p log log p * log p) = O(p (log p)^2 log log p) (Hmm, where does one find or determine a value for GCD_DC_THRESHOLD?) https://www.ams.org/journals/mcom/20...17-0/home.html confirms asymptotic run time O(n (log n)^2 log log n) where n is number of bits of an operand. Last fiddled with by kriesel on 2021-10-02 at 14:58 |
![]() |
![]() |
![]() |
#461 | |
"Seth"
Apr 2019
1110111012 Posts |
![]() Quote:
You can find thresholds at https://gmplib.org/devel/thres/ GCD_DC_THRESHOLD seems to be around 300-400 depending on platform. Last fiddled with by SethTro on 2021-10-02 at 18:26 |
|
![]() |
![]() |
![]() |
#462 |
Dec 2002
25×33 Posts |
![]() |
![]() |
![]() |