mersenneforum.org Understanding CPUs on the Benchmark Page

2016-01-26, 23:34 #1 Fred "Ron" Jan 2016 Fitchburg, MA 97 Posts

Understanding CPUs on the Benchmark Page

Disclaimer: I'm NOT a mathematician. I've been spending the last few days learning about Mersenne primes and GIMPS, and I'm becoming fascinated by the concept. Feel free to reply as though you're talking to a child.

Can someone help me understand what I'm seeing on the CPU Benchmark page? Would the results on this page be helpful in selecting a processor for a system that would primarily run LL tests on Prime95 24/7? I noticed that the Intel Core i5-6600 @ 3.30GHz had the lowest numbers for all the middle columns, but I'm not sure exactly what's being represented there. I'm guessing the lower the better? For the $, this processor seems mid-range, so I was surprised at the results (if in fact those are good results).

2016-01-27, 01:22 #2 bgbeuning Dec 2014 255₁₀ Posts

In the benchmark table, the numbers at the top ending in M are exponents p in 2^p-1. The Lucas-Lehmer (LL) test does about p iterations (p-2, to be exact). So when you see 68M to 77M, that means any p in that range will take that long per iteration. So the i5-6600 at 16.25 (milliseconds / iteration) will take 70,000,000 * 0.01625 = 1,137,500 seconds = about 13 days to complete one LL test. (Someone please correct me if this is wrong.)

There are numerous threads on this forum about building the most bang-for-the-buck systems. Here is a recent one: http://www.mersenneforum.org/showthread.php?t=20795

Welcome to GIMPS.

2016-01-27, 02:42 #3 Fred "Ron" Jan 2016 Fitchburg, MA 1100001₂ Posts

Thanks for your response! So that all makes sense. The only part I'm still confused on I guess are the results for that particular processor, in comparison to all the other (more expensive, technically powerful) processors. For example, test results from an Intel Core i7-5820K @ 3.30GHz (which retails for about twice as much as the i5-6600) shows results that are almost half the speed. Could the i5-6600 just be a better fit for this type of math?
How prevalent are test results from trolls, and could these results for the i5-6600 be bogus?

2016-01-27, 03:15 #4 VBCurtis "Curtis" Feb 2005 Riverside, CA 19×251 Posts

Scroll further down, and you'll see the i5-6500 nearly as fast. It appears the new Skylake generation of Intel CPUs (those starting with 6) is indeed quite fast at LL testing. DDR4 helps, for sure; the tests are often limited by memory speed rather than CPU speed. This memory bottleneck is one of the reasons it's more efficient to run a single test over all 4 cores instead of running 4 separate tests on one core each. There are threads aplenty about that optimization, too!

CPUs starting with 6 use new DDR4 memory, rather than the slower DDR3 that has been around a while. Luckily, DDR4 has been out over a year, so prices are pretty good. Read the bang-for-buck and/or cheap system threads for more discussion about which bits beyond the CPU matter for best production, as well as analysis of electricity costs per test.

Welcome!

2016-01-27, 03:20 #5 ATH Einyen Dec 2003 Denmark 6062₈ Posts

Make sure you click on the link to the specific processor; there seem to be very big differences between individual timings on the same CPU.

Currently the very fastest computers for LL tests are the big Xeon servers with 24-32 cores, but they are very expensive. After that it is probably the Haswell-E (until Broadwell-E arrives): 5960X, 5930K, 5820K, and then it is the new Skylake CPUs: 6700K, 6700, 6600 and so on.

I have not researched this in detail, but my hunch says the most speed per dollar is probably either the low-end Haswell-E 5820K or the mid to high end Skylake processors. Haswell-E has quad channel memory, which helps a lot versus dual channel for Skylake, and Haswell-E has 6 or 8 cores vs 2 to 4 cores on Skylake.
Last fiddled with by ATH on 2016-01-27 at 03:22

2016-01-27, 04:00 #6 LaurV Romulan Interpreter Jun 2011 Thailand 2⁵×5×59 Posts

Quote:
 Originally Posted by rleshane I noticed that the Intel Core i5-6600 @ 3.30GHz had the lowest numbers for all the middle columns, but I'm not sure exactly what's being represented there? I'm guessing the lower the better? For the $, this processor seems mid-range, so I was surprised at the results (if in fact those are good results)?
Scrolling down to the end of the table (or clicking on your CPU) and reading the last line may help. Those are the times in milliseconds, corresponding to each FFT size in the header. Of course, lower is faster/better. Please note that those are times per core, so a CPU with 8 cores will produce twice as much work as one with 4 cores at the same clock and with the same numbers in the table, assuming your memory bandwidth does not bottleneck it.
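
A rough sketch of how a per-iteration timing turns into wall-clock time for one LL test, following the arithmetic given earlier in the thread (exponent times seconds per iteration). The function names are made up for illustration, and the 70M exponent and 16.25 ms figure are the ones quoted above for the i5-6600. The tiny `lucas_lehmer` routine is only there to show where the iteration count comes from; it is not how Prime95 implements the test, which relies on FFT-based multiplication.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime. Assumes p itself is prime."""
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the loop below needs p > 2
    m = (1 << p) - 1
    s = 4
    # p - 2 squarings modulo 2**p - 1: the "about p iterations" in the thread.
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def ll_test_seconds(exponent: int, ms_per_iteration: float) -> float:
    """Approximate run time of one LL test, using 'exponent' iterations."""
    return exponent * ms_per_iteration / 1000.0


def ll_test_days(exponent: int, ms_per_iteration: float) -> float:
    """Same estimate expressed in days (86,400 seconds per day)."""
    return ll_test_seconds(exponent, ms_per_iteration) / 86_400


if __name__ == "__main__":
    # Small exponents only; real GIMPS exponents are far too large for this loop.
    print([p for p in (3, 5, 7, 11, 13) if lucas_lehmer(p)])
    # Figures quoted in the thread: p around 70M at 16.25 ms per iteration.
    print(f"{ll_test_seconds(70_000_000, 16.25):,.0f} seconds")
    print(f"{ll_test_days(70_000_000, 16.25):.1f} days")
```

The estimate ignores the difference between p and p - 2 iterations, which is negligible at this scale, and it assumes the per-iteration time stays constant for the whole run.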

Quote:
 Originally Posted by rleshane Thanks for your response! So that all makes sense. The only part I'm still confused on I guess are the results for that particular processor, in comparison to all the other (more expensive, technically powerful) processors. For example, test results from an Intel Core i7-5820K @ 3.30GHz (which retails for about twice as much as the i5-6600) shows results that are almost half the speed. Could the i5-6600 just be a better fit for this type of math? How prevalent are test results from trolls, and could these results for the i5-6600 be bogus?
The trick here is AVX and FMA3: the newer toys have wider vector registers inside, and can do more multiplications in parallel in the same time. So yes, they are better for this type of calculation.

Last fiddled with by LaurV on 2016-01-27 at 04:08

2016-01-27, 15:27 #7 Fred "Ron" Jan 2016 Fitchburg, MA 1100001₂ Posts

Thanks to you all! Fantastic info, and exactly what I was looking for. Looks like I have lots more reading to do in the referenced threads. Thanks again!

2016-01-27, 17:05   #8
Serpentine Vermin Jar

Jul 2014

37×89 Posts

Quote:
 Originally Posted by rleshane Thanks to you all! Fantastic info, and exactly what I was looking for. Looks like I have lots more reading to do in the referenced threads. Thanks again!
It can be a fun education if you enjoy reading about all the details of CPUs, memory, etc.

Remember that the timings show the benchmark times (per iteration speeds, so lower is better) for a *single* core on the CPU.

As mentioned, depending on your memory setup (DDR3, DDR4, dual/quad channel, mem speed, etc) you may discover that it's better to assign all 4 cores of a CPU to a single worker, rather than trying to run 4 separate workers and flooding the memory channel with way too much stuff.

I'm hoping that some future and awesome version of Prime95 will do a memory timing of some sort and automatically set the optimal cores-per-worker / total-workers ... I see too many cases of someone running 4 workers, each one using one core, and doing large 100 million digit tests on each one. They run horribly that way. Heck, it doesn't even have to be a 100M digit test... it could just be 4 exponents in the 70-80M range, and they take WAY longer than they would if only one of them were running (or if all cores were on one worker).

Fortunately it's somewhat easy to tell... set each worker to have its own window in the GUI and then let all of them run for a while to get an idea of the per-iteration times. Then stop all but one and see how much (if at all) that one worker's timings change.

If you're memory bandwidth starved, stopping all but one should make that one worker's timings MUCH faster.

Running more than one core in a worker doesn't quite scale linearly... giving a worker 2 cores will not double its speed, but it can get close. Ernst has been working hard at making Mlucas (a Linux alternative to Prime95) scale up. Mr. Prime95 himself has said that it might have to do with the way chunks of the FFT are distributed among cores... the work might not be split evenly between them all, so there could be periods when some core or another has nothing to do.

In general though it scales well enough, and if you have that memory bandwidth issue you'll probably get better overall throughput by doing it that way.
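
The comparison described above reduces to a little arithmetic: combined throughput is the number of workers times each worker's iterations per second. A minimal sketch, assuming all workers run exponents of similar size; the function names and the sample timings are hypothetical placeholders, not measurements from this thread.

```python
def iterations_per_second(workers: int, ms_per_iteration: float) -> float:
    """Combined throughput when each of `workers` workers runs at ms_per_iteration."""
    return workers * 1000.0 / ms_per_iteration


def best_layout(layouts: dict) -> str:
    """Return the name of the layout with the highest combined throughput.

    `layouts` maps a label to a (number of workers, ms per iteration) pair.
    """
    return max(layouts, key=lambda name: iterations_per_second(*layouts[name]))


if __name__ == "__main__":
    # Hypothetical timings read off the worker windows, NOT real measurements.
    layouts = {
        "4 workers x 1 core": (4, 40.0),
        "1 worker x 4 cores": (1, 8.0),
    }
    for name, (workers, ms) in layouts.items():
        print(f"{name}: {iterations_per_second(workers, ms):.1f} iterations/s")
    print("Higher throughput:", best_layout(layouts))
```

Plugging in the timings observed on a given machine shows which layout finishes more work per day; on a system that is not starved for memory bandwidth the separate workers may well come out ahead.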

Hope that helps... in other words, for bang for the buck you might look at more, but slower, cores on a chip rather than faster but fewer cores. If money is no object, get both... faster and more cores.

And get the fastest memory subsystem you can. Right now that would be any CPU that supports quad channel DDR4, at whatever the fastest speed it can do. It doesn't have to be a lot of RAM for just doing LL tests, so it could be 4 x 1GB modules if you didn't need oodles of gigabytes and wanted to save a bit of cash on the memory side. More RAM really only helps (with regard to Prime95 alone) if you're doing certain types of ECM or P-1 factoring.

2016-01-27, 20:55   #9
Gordon

Nov 2008

3·167 Posts

Quote:
 Originally Posted by Madpoo It can be a fun education if you enjoy reading about all the details of CPUs, memory, etc. [snip] More RAM really only helps (in regards to Prime95 alone) if you're doing certain types of ECM or P-1 factoring.
Yeah, *only* 32 GB doesn't go far when running multiple instances of ECM. Yes, I do know about maxmem, but it REALLY slows things down, and quite often fails anyway when trying to allocate just 4 GB of RAM with over 20 GB free...

2016-01-27, 23:40 #10 Fred "Ron" Jan 2016 Fitchburg, MA 1100001₂ Posts

So, after taking everything into consideration, I'm thinking of seeing how much bang for my buck I can get for around $300. I'm thinking about replicating this build as the budget allows over time to scale up for more processing, making a home-made case of some sort to safely house the motherboards. http://pcpartpicker.com/p/LDfPjX

Thoughts? I know the memory speed isn't ideal (DDR4-2133). To upgrade the motherboard and memory to handle DDR4-3200 would add about 33% to the overall cost, and my thought is that it probably would not yield a 33% increase in LL processing?

You'll notice there is no hard drive. I have a bunch of crappy old 160 GB hard drives. I don't see much talk of hard drives being an issue in the threads I've read so far. Is it safe to assume that hard drive speed would not be a bottleneck with a system like the one I'm proposing (running Prime95 LL tests on Linux)?

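
One way to frame the memory-upgrade question: a build that costs 33% more only improves throughput per dollar if it is also more than 33% faster. A hedged sketch with a hypothetical function name; the speedup figures are made up, since the thread gives no measured numbers for DDR4-3200 versus DDR4-2133.

```python
def relative_value(cost_increase: float, speed_increase: float) -> float:
    """Throughput per dollar of the upgraded build relative to the base build.

    Both arguments are fractions (0.33 means +33%). A result of 1.0 is
    break-even, above 1.0 favours the upgrade, below 1.0 favours the base build.
    """
    return (1.0 + speed_increase) / (1.0 + cost_increase)


if __name__ == "__main__":
    # +33% cost is the figure from the post; the speedups are illustrative guesses.
    for speedup in (0.10, 0.33, 0.50):
        value = relative_value(0.33, speedup)
        print(f"+{speedup:.0%} speed for +33% cost -> {value:.2f}x value per dollar")
```

Because the plan is to replicate the base build over time, the alternative use of the same money (more copies of the cheaper build) scales throughput roughly in proportion to cost, which is why break-even sits at an equal percentage.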
2016-01-28, 00:06   #11
Xyzzy

"Mike"
Aug 2002

2×7²×83 Posts

Quote:
 Originally Posted by rleshane You'll notice there is no hard drive. I have a bunch of crappy old 160gb hard drives. I don't see much talk of hard drives being an issue in the threads I've read so far. Is it safe to assume that hard drive speed would not be a bottleneck with a system like the one I'm proposing (running Prime95 LL tests on Linux)?
We have run mprime with a 4GB USB key and Linux, and that was more than enough room.

You would only need 2×512MB RAM as well. (Do they make it that small any more?)
