2011-12-04, 19:41  #1 
4,051 Posts 
Parallelizing Pollard's P-1 Algorithm
I would like to somehow parallelize Pollard's p-1 algorithm over multiple machines. I am currently writing a project proposal and I don't know whether it is possible.
I heard about the GIMPS project and, since it is distributed, I figured that you all would know best whether this project is even possible. For Pollard's P-1, does GIMPS distribute a single job across multiple machines (in order to get a speedup)? Or does it give each machine a *different* factoring job? Is Pollard's P-1 algorithm parallelizable? 
2011-12-04, 20:27  #2  
Jun 2003
1169_{10} Posts 
In principle, the two stages could be done by different machines, or a stage could be started on one machine and completed on another. GIMPS does not do this, largely because of the difficulty of transferring, storing, and tracking large amounts of data.


2011-12-04, 21:28  #3 
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 89<O<88
3×29×83 Posts 
I would say, rather than parallelizing over multiple computers, you might try parallelizing it for GPUs; they have literally hundreds of processors going at once, and GIMPSters would certainly love a GPU P-1 program.
Mr. P-1, when you say stage 1 isn't parallelizable, do you mean in the same way as LL, i.e. you lose efficiency but it's possible, or do you truly mean not at all? 
2011-12-05, 02:50  #4 
"Richard B. Woods"
Aug 2002
Wisconsin USA
2^{2}×3×641 Posts 

2011-12-05, 03:50  #5 
Jun 2003
1533_{16} Posts 
Stage 1 is essentially a serial computation. You _can_ use multiple threads to accelerate the stage 1 multiplications (a la prime95), but there is a limit to how much speedup you can achieve that way.
OTOH, you can use multiple independent processes to parallelize stage 2, with virtually no limit. Stage 2 is what you'd call "embarrassingly" parallel. Last fiddled with by axn on 2011-12-05 at 03:50 
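A rough Python sketch of why stage 2 splits so cleanly. This is the textbook "standard continuation", not prime95's actual stage 2, and all names, the bounds, and the example numbers are mine: each worker needs only the stage 1 residue H and its own slice of the primes in (B1, B2], and the partial results merge with a single multiplication mod N at the end.

```python
from math import gcd

def stage2_chunk(n, h, primes_slice):
    """Accumulate prod(h^p - 1) mod n over one slice of stage 2 primes."""
    acc = 1
    for p in primes_slice:
        acc = acc * (pow(h, p, n) - 1) % n
    return acc

def stage2_parallel(n, h, primes, workers=4):
    # In a real deployment each slice would run on a separate machine;
    # here we just show that the slices are independent and combine by
    # multiplication mod n, in any order.
    slices = [primes[i::workers] for i in range(workers)]
    acc = 1
    for partial in (stage2_chunk(n, h, s) for s in slices):
        acc = acc * partial % n
    return gcd(acc, n)

# Toy run: n = 2323 = 23 * 101.  With B1 = 3 the stage 1 residue is
# h = 2^(2*3) = 64, and the stage 2 prime 11 (since 23 - 1 = 2 * 11)
# exposes the factor 23 no matter how the primes are sliced.
n, h = 2323, 64
stage2_primes = [5, 7, 11, 13]
```

Because each chunk depends only on h and its own primes, adding more workers costs nothing but the one merge step, which is the "virtually no limit" axn describes.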
2011-12-05, 04:14  #6  
"Richard B. Woods"
Aug 2002
Wisconsin USA
2^{2}×3×641 Posts 
No, it is _not_ essentially serial. It's just a bunch of modular multiplications, with the final product _not_ dependent on the order in which the multiplications are done!
In LL, there's that pesky -2 in each iteration that prevents us from computing, far enough in advance, the exact values to be squared in iterations far enough ahead to be usefully parallelized. There's no such barrier to forecasting any given value to be multiplied in P-1 stage 1, that I know of.
Please explain, exactly, the supposed impossibility that prevents parallelizing stage 1's multiplications. Last fiddled with by cheesehead on 2011-12-05 at 04:27 

2011-12-05, 04:24  #7  
Jun 2003
3^{4}×67 Posts 
a*b*c*d = (a*b) * (c*d): the parts in brackets can execute in parallel. ((a^b)^c)^d: strictly serial.
Even without the -2, the LL test is a long series of squarings, with each iteration dependent on the outcome of the previous one: inherently serial. 
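axn's distinction, in toy form with plain Python integers (the values and grouping are my own illustration):

```python
from functools import reduce
from operator import mul

a, b, c, d = 3, 5, 7, 11

# Multiplication is associative: any grouping of a*b*c*d gives the same
# result, so two workers can compute (a*b) and (c*d) independently and
# one final step merges them.
assert (a * b) * (c * d) == reduce(mul, [a, b, c, d])

# Nested exponentiation is different: each step consumes the previous
# result, so a second worker has nothing to start on until the first
# finishes.  (It collapses to a**(b*c*d), but computing that mod N is
# still one serial square-and-multiply chain.)
assert ((a ** b) ** c) ** d == a ** (b * c * d)
```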

2011-12-05, 04:42  #8  
9,001 Posts 
But at the moment I have two options: 1) find some problem to solve using multiple CPU cores, or 2) find some problem to solve using multiple machines across the network. I would really love to write a GPU program, but I have no experience with CUDA. We are using OpenMP for our project. Hopefully, after gaining experience writing parallel programs, I will be able to write them for the GPU next. Thanks for all the replies; they give me more confidence that it is at least possible to parallelize some parts of Pollard's algorithm. I wanted to pick a cool problem to solve but didn't know if it was too advanced. 

2011-12-05, 04:46  #9 
Jul 2003
So Cal
2×7×181 Posts 
Stage 1 is basically a big modular exponentiation. The time is dominated by two large multiplications per step, assuming Montgomery or Barrett reduction, so for GIMPS-sized numbers GPUs should be decent. The multiplications can be done using cuFFT, as in CudaLucas. Perhaps I'm overlooking something?
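frmky's description in miniature. The function name, toy bound, and example modulus are mine, and a real implementation streams the huge exponent through FFT-based multiplication rather than calling `pow` on small numbers, but the shape is the same: build one exponent E from all prime powers up to B1, do one big modular exponentiation, take a gcd.

```python
from math import gcd

def stage1_exponent(b1):
    """Product of all maximal prime powers p^k <= b1 (toy trial division)."""
    e = 1
    for p in range(2, b1 + 1):
        if all(p % q for q in range(2, int(p ** 0.5) + 1)):  # p is prime
            pk = p
            while pk * p <= b1:   # lift to the largest power still <= b1
                pk *= p
            e *= pk
    return e

# Toy run: n = 1313 = 13 * 101.  With B1 = 5, E = 4 * 3 * 5 = 60, and
# 13 - 1 = 12 divides E, so one big modular exponentiation plus a gcd
# exposes the factor 13.
n, b1 = 1313, 5
e = stage1_exponent(b1)
factor = gcd(pow(2, e, n) - 1, n)
```

Note that while E itself can be assembled in parallel, the square-and-multiply chain inside the modular exponentiation is the serial part the rest of the thread is arguing about.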

2011-12-05, 04:52  #10 
"Richard B. Woods"
Aug 2002
Wisconsin USA
2^{2}·3·641 Posts 
Perhaps the argument that y'all should be making is not that stage 1 isn't parallelizable, but that the degree of parallelization required to accomplish a net savings in elapsed time (vs. serial processing) is exponential (or something), and thus soon becomes impractical in the general case?
Last fiddled with by cheesehead on 2011-12-05 at 05:01 
2011-12-05, 04:57  #11  
"Richard B. Woods"
Aug 2002
Wisconsin USA
2^{2}·3·641 Posts 
:) Last fiddled with by cheesehead on 2011-12-05 at 05:06 
