20051218, 22:48  #1 
Sep 2002
59_{16} Posts 
NEW MERSENNES AND MULTITHREADED SOFTWARE
It seams that intel and amd will have quad  16+core processors within the next 5 years, add that to duel processor computers which are fairly affordable and 4 processor systems which are in the reach of a few. Even a duel processor with 8 cores = 16 cores working on a number. A quad with 16 cores means throwing 64 cores at a mersenne number. with glucas and mlucas I wonder what the optimum number of cores are before they have no effect and is that software capable of searching for 100million digit numbers. Lets face it the day of the first 10million digit are coming to a close fast. Since single core computers are not getting faster anymore, the only logical next step for mersenne hunters is multithreaded software. Software writing talents need to try and make speed advancements in the multithreaded software thats out there for primes.

20051219, 01:05  #2 
Jul 2004
Nowhere
809 Posts 
Thats easier said then done.
What really needs to be found is a new method. LL is good to a point but if a new method was discovered that was easier to impliment on muticore systems wouldnt we be better off. 
20051219, 01:17  #3  
Jun 2005
2×191 Posts 
Quote:
If needed, FFT operations can be adapted to multicore systems using a divideandconquer approach. They won't make as efficient use of the processors as the current methods will, however, but it would still be worth it, especially as the exponents get larger. Drew 

20051219, 03:20  #4 
Sep 2002
1011001_{2} Posts 
According to a benchmark for a 3.8 p4 a 100 million digit number would take 6.3 years, if software could be improved so that a duel processor with 8 cores ea could finish a number in a year. That would be just about what the first 10 million digit numbers were taking when P95 was first able to handle them, at that time the hot machines were PIII  550. I rember redoing a duel PII350 tiger direct to duel PIII550'S so i could run them right away. I had been using the tiger for 2 years 24/7 before upgrading it in 99 to run the 10million digit numbers. The other important thing is save files that people can send in if they decide they no longer want to run the 100million digit number so that months of work aren't wasted.

20051219, 05:41  #5  
2×17×239 Posts 
wizard conjuring a spell
Quote:


20051219, 06:14  #6 
Jul 2004
Nowhere
809 Posts 
Yes i know that it its easier said then done. Im susgesting that it might be easier to find a new method then use ll in muti core envrioments. I belive from what i heard that ll relies on the results of the illiteration before to run the next test.

20051219, 06:26  #7  
Jun 2005
2×191 Posts 
Quote:
Consider a new method virtually impossible. Granted, you can't rule out the possibility, but it's certainly not easier than the alternative. You're right in that each step relies on the previous, but that doesn't mean you can't use parallel processing within an iteration. It will require some interprocesscommunication during each iteration, but since an iteration takes as long as tens of milliseconds, that's probably not an unreasonable amount of overhead. I suppose the tricky question is how do you distribute the work when other processes are running? You wouldn't want to hold up several cores waiting for the result from one that yielded cycles to another application. Drew Last fiddled with by drew on 20051219 at 06:35 

20051219, 19:56  #8  
Feb 2004
France
914_{10} Posts 
Quote:
So, the problem is to write a parallel FFT doing the squarring. Since I still do not understand how a sequential FFT works, don't expect me to talk about a // FFT ! Guillermo did such a work. And now Ernst. But this runs perfectly only on a pure SMP (very expensive, like the machines of IBM based on PPC). On cheaper machines, based on NUMA, it requires something even more complex: manage the memory so that threads using the same part of memory are located in the same block of processors. Not easy ... Tony 

20051219, 21:28  #9 
Jul 2004
Nowhere
809 Posts 
mabey we should look at these...
http://www.tgdaily.com/2005/12/19/ne...re_processors/ 
20051219, 22:06  #10  
Feb 2004
France
392_{16} Posts 
Quote:
What are you talking about in your previous email ? Mathematical method for proving primality or Mathematical/Computational method for coding the Mathematical method ? The (Mathematical method) LLT is the fastest for Mersenne. The (Mathematical method) FFT is the fastest for squarring, and has been specialized for Mersenne numbers. Parallelizing FFT is an improvement, but (as I said in previous email) either you have a very expensive really SMP machine or you have a less expensive NUMA machine. (In fact, SMP is NUMA with a very small NUMA factor). GLucas is parallelized FFT, but it is not scalable on NUMA machines with more than 10 or 12 CPUS. I don't know about GLucas on SMP machines. I guess experts can parallelize FFT in order to be scalable on large NUMA machines. But it would require a lot of work ! Tony 

20051220, 23:56  #11 
Nov 2005
South Carolina
7×11 Posts 
Stupid question time...
What, EXACTLY, and without lots of techspeak, is a dualcore machine? I thought it was like having two processors? Wouldn't a dualcore machine be better off running two numbers at once, and doubling throughput that way? 
Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Multithreaded factoring  bchaffin  Aliquot Sequences  8  20101024 13:38 
Stars and Mersennes  David John Hill Jr  Science & Technology  2  20091213 09:47 
MultiCore / MultiCPU Assignments (missing)  worknplay  Software  3  20081105 17:26 
Factoring Double mersennes  Citrix  Miscellaneous Math  2  20051004 08:08 
Prime95 a multithreaded application?  Unregistered  Software  10  20040611 05:31 