mersenneforum.org  

Old 2016-01-26, 23:34   #1
Fred
 
"Ron"
Jan 2016
Fitchburg, MA

97 Posts
Understanding CPUs on the Benchmark Page

Disclaimer: I'm NOT a mathematician. I've spent the last few days learning about Mersenne primes and GIMPS, and I'm becoming fascinated by the concept. Feel free to reply as though you're talking to a child.

Can someone help me understand what I'm seeing on the CPU Benchmark page? Would the results on this page be helpful in selecting a processor for a system which would primarily run LL on Prime95 24/7?

I noticed that the Intel Core i5-6600 @ 3.30GHz had the lowest numbers in all the middle columns, but I'm not sure exactly what's being represented there. I'm guessing lower is better? For the $, this processor seems mid-range, so I was surprised at the results (if in fact those are good results).
Old 2016-01-27, 01:22   #2
bgbeuning
 
Dec 2014

255₁₀ Posts

In the benchmark table, the numbers at the top ending in M are the exponents p in 2^p-1.
The Lucas-Lehmer (LL) test does about p iterations. So when you see 68M to 77M,
that means any p in that range will take that long per iteration.
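
For anyone curious what one of those iterations actually is, here is a minimal sketch of the Lucas-Lehmer test in plain Python. It is only an illustration of the math: Prime95 does the squaring with large floating-point FFTs, which this sketch does not attempt, and the function name is made up for the example. Strictly the test does p - 2 squarings, which is essentially p for exponents in the tens of millions.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime. Assumes p itself is prime."""
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the loop below needs p > 2
    m = (1 << p) - 1  # the Mersenne number 2**p - 1
    s = 4
    for _ in range(p - 2):  # one "iteration" = one squaring modulo m
        s = (s * s - 2) % m
    return s == 0


# Small prime exponents only; real GIMPS exponents are far too large for this.
print([p for p in (3, 5, 7, 11, 13, 17, 19, 23, 31) if lucas_lehmer(p)])
# prints [3, 5, 7, 13, 17, 19, 31]
```

Each pass through the loop is the unit of work that the benchmark page times in milliseconds.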

So the i5-6600 at 16.25 (milliseconds / iteration) will take
70,000,000 * 0.01625 = 1,137,500 seconds = about 13 days

to complete one LL test. (Someone please correct me if this is wrong.)
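
The same estimate can be sketched in a few lines of Python, in case you want to plug in other rows of the table. The function name is hypothetical, the 16.25 ms figure is the one quoted above, and 70,000,000 is just a round exponent in the 68M to 77M column.

```python
SECONDS_PER_DAY = 86_400


def ll_test_days(exponent: int, ms_per_iteration: float) -> float:
    """Rough wall-clock days for one LL test at a fixed per-iteration time."""
    iterations = exponent - 2  # essentially the exponent itself at this size
    seconds = iterations * ms_per_iteration / 1000
    return seconds / SECONDS_PER_DAY


days = ll_test_days(70_000_000, 16.25)
print(f"{days:.1f} days")  # prints "13.2 days"
```

This assumes the per-iteration time stays constant for the whole run, which ignores other load on the machine.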

There are numerous threads on this forum about building the most
bang-for-the-buck systems. Here is a recent one:

http://www.mersenneforum.org/showthread.php?t=20795

Welcome to GIMPS.
Old 2016-01-27, 02:42   #3
Fred
 
"Ron"
Jan 2016
Fitchburg, MA

1100001₂ Posts

Thanks for your response! That all makes sense. The only part I'm still confused about is the results for that particular processor in comparison to all the other (more expensive, technically more powerful) processors.

For example, test results from an Intel Core i7-5820K @ 3.30GHz (which retails for about twice as much as the i5-6600) show almost half the speed. Could the i5-6600 just be a better fit for this type of math? How prevalent are test results from trolls, and could these results for the i5-6600 be bogus?
Old 2016-01-27, 03:15   #4
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

19×251 Posts

Scroll further down, and you'll see the i5-6500 nearly as fast. It appears the new Skylake generation of Intel CPUs (those starting with 6) are indeed quite fast at LL-testing.

DDR4 helps, for sure: the tests are often limited by memory speed rather than CPU speed. This memory bottleneck is one of the reasons it's more efficient to run a single test over all 4 cores instead of running 4 separate tests on one core each. There are threads aplenty about that optimization, too! CPUs starting with 6 use the new DDR4 memory rather than the slower DDR3 that has been around a while. Luckily, DDR4 has been out over a year, so prices are pretty good.

Read the bang-for-buck and/or cheap system threads for more discussion about which bits beyond the CPU matter for best production, as well as analysis of electricity costs per test.
Welcome!
Old 2016-01-27, 03:20   #5
ATH
Einyen
 
Dec 2003
Denmark

6062₈ Posts

Make sure you click on the link to the specific processor; there seems to be a very big difference between timings on the same CPU.

Currently the very fastest computers for LL tests are the big Xeon servers with 24-32 cores, but they are very expensive. After that it is probably the Haswell-E line (until Broadwell-E arrives): 5960X, 5930K, 5820K, and then the new Skylake CPUs: 6700K, 6700, 6600 and so on.

I have not researched this in detail but my hunch says the most speed per dollar is probably either low-end Haswell-E 5820K or the mid to high end Skylake processors. Haswell-E has quad channel memory which helps a lot versus dual channel for Skylake, and Haswell-E has 6 or 8 cores vs 2 to 4 cores on Skylake.

Last fiddled with by ATH on 2016-01-27 at 03:22
Old 2016-01-27, 04:00   #6
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

2⁵×5×59 Posts

Quote:
Originally Posted by rleshane
I noticed that the Intel Core i5-6600 @ 3.30GHz had the lowest numbers in all the middle columns, but I'm not sure exactly what's being represented there. I'm guessing lower is better? For the $, this processor seems mid-range, so I was surprised at the results (if in fact those are good results).

Scrolling down to the end of the table (or clicking on your CPU) and reading the last line may help. Those are the times in milliseconds, corresponding to each FFT size in the header. Of course, lower is faster/better. Please note that those are times per core, so a CPU with 8 cores will produce twice the work of one with 4 cores at the same clock and with the same numbers in the table, assuming memory bandwidth does not bottleneck it.

Quote:
Originally Posted by rleshane
Thanks for your response! That all makes sense. The only part I'm still confused about is the results for that particular processor in comparison to all the other (more expensive, technically more powerful) processors.

For example, test results from an Intel Core i7-5820K @ 3.30GHz (which retails for about twice as much as the i5-6600) show almost half the speed. Could the i5-6600 just be a better fit for this type of math? How prevalent are test results from trolls, and could these results for the i5-6600 be bogus?

The trick here is AVX and FMA3: the newer toys have wider vector registers and can do more floating-point multiplications in parallel in the same time. So yes, they are better for this type of calculation.

Last fiddled with by LaurV on 2016-01-27 at 04:08
Old 2016-01-27, 15:27   #7
Fred
 
"Ron"
Jan 2016
Fitchburg, MA

1100001₂ Posts

Thanks to you all! Fantastic info, and exactly what I was looking for. Looks like I have lots more reading to do in the referenced threads. Thanks again!
Old 2016-01-27, 17:05   #8
Madpoo
Serpentine Vermin Jar
 
Jul 2014

37×89 Posts

Quote:
Originally Posted by rleshane
Thanks to you all! Fantastic info, and exactly what I was looking for. Looks like I have lots more reading to do in the referenced threads. Thanks again!

It can be a fun education if you enjoy reading about all the details of CPUs, memory, etc.

Remember that the timings show the benchmark times (per iteration speeds, so lower is better) for a *single* core on the CPU.

As mentioned, depending on your memory setup (DDR3, DDR4, dual/quad channel, mem speed, etc) you may discover that it's better to assign all 4 cores of a CPU to a single worker, rather than trying to run 4 separate workers and flooding the memory channel with way too much stuff.

I'm hoping that some future and awesome version of Prime95 will do a memory timing of some sort and automatically set the optimal cores-per-worker / total-workers ... I see too many cases of someone running 4 workers, each one using one core, and doing large 100 million digit tests on each one. They run horribly that way. Heck, it doesn't even have to be a 100M digit test... it could just be 4 exponents in the 70-80M range, and they take WAY longer than they would if only one of them were running (or if all cores were on one worker).

Fortunately it's somewhat easy to tell... set each worker to have its own window in the GUI and then let all of them run for a while to get an idea of the per iteration times. Then stop all but one and see how much (if at all) that one worker's timings change.

If you're memory bandwidth starved, stopping all but one should make that one worker's timings MUCH faster.
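
If you want to put a number on that comparison, something like the following Python sketch would do. Everything in it is an assumption for illustration: the function names are made up, the timings are invented rather than measured, and the 1.25 cutoff is an arbitrary threshold, not anything Prime95 defines.

```python
def contention_slowdown(ms_all_running: float, ms_alone: float) -> float:
    """How many times slower one worker ran while the others were active."""
    return ms_all_running / ms_alone


def looks_bandwidth_starved(ms_all_running: float, ms_alone: float,
                            threshold: float = 1.25) -> bool:
    """Heuristic only: flag a big slowdown as likely memory contention."""
    return contention_slowdown(ms_all_running, ms_alone) >= threshold


# Hypothetical readings from one worker window (ms per iteration).
with_four_workers = 30.0  # all four workers running
running_alone = 18.0      # the other three stopped

print(round(contention_slowdown(with_four_workers, running_alone), 2))
print(looks_bandwidth_starved(with_four_workers, running_alone))
```

A ratio near 1.0 would suggest the workers are not fighting over memory; a ratio well above it suggests trying all cores on a single worker.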

Running more than one core in a worker doesn't quite scale linearly... giving a worker 2 cores will not double its speed, but it can get close. Ernst has been working hard at making Mlucas (a Linux alternative to Prime95) scale up. Mr. Prime95 himself has said that it might have to do with the way chunks of the FFT are distributed among cores... it might not do it evenly between them all, so there could be periods when some core or another has nothing to do.

In general though it scales well enough, and if you have that memory bandwidth issue you'll probably get better overall throughput by doing it that way.

Hope that helps... in other words, bang for buck you might look at more, but slower, cores on a chip rather than faster but fewer cores. If money is no object, get both... faster and more cores.

And get the fastest memory subsystem you can. Right now that would be any CPU that supports quad channel DDR4, at whatever the fastest speed it can do. It doesn't have to be a lot of RAM for just doing LL tests, so it could be 4 x 1GB modules if you didn't need oodles of gigabytes and wanted to save a bit of cash on the memory side. More RAM really only helps (in regards to Prime95 alone) if you're doing certain types of ECM or P-1 factoring.
Old 2016-01-27, 20:55   #9
Gordon
 
Nov 2008

3·167 Posts

Quote:
Originally Posted by Madpoo
It can be a fun education if you enjoy reading about all the details of CPUs, memory, etc.

[snip]

More RAM really only helps (in regards to Prime95 alone) if you're doing certain types of ECM or P-1 factoring.

Yeah, *only* 32 GB doesn't go far when running multiple instances of ECM. Yes, I do know about maxmem, but it REALLY slows things down, and quite often it fails anyway trying to allocate just 4 GB of RAM when there is over 20 GB free...
Old 2016-01-27, 23:40   #10
Fred
 
"Ron"
Jan 2016
Fitchburg, MA

1100001₂ Posts

So, after taking everything into consideration, I'm thinking of seeing how much bang for my buck I can get for around $300. I'm thinking about replicating this build as the budget allows over time to scale up for more processing, making a home-made case of some sort to safely house the motherboards.

http://pcpartpicker.com/p/LDfPjX

Thoughts? I know the memory speed isn't ideal (DDR4-2133). Upgrading the motherboard and memory to handle DDR4-3200 would add about 33% to the overall cost, and my thought is that it probably would not yield a 33% increase in LL processing.
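
To make that trade-off concrete, here is a rough Python sketch of the break-even arithmetic. The 33% cost figure is the one from this post; the 15% speedup is purely a placeholder, not a measured DDR4-3200 result, and the function name is made up. It also ignores electricity, which matters for a machine running 24/7.

```python
def throughput_per_dollar_ratio(cost_increase: float, speedup: float) -> float:
    """Upgraded build's throughput per dollar relative to the base build.

    Both arguments are fractions, e.g. 0.33 for +33%.
    A result above 1.0 means the upgrade pays for itself in work per dollar.
    """
    return (1 + speedup) / (1 + cost_increase)


# +33% cost with a hypothetical +15% LL throughput.
print(round(throughput_per_dollar_ratio(0.33, 0.15), 2))  # prints 0.86

# Break-even: the speedup has to match the cost increase.
print(throughput_per_dollar_ratio(0.33, 0.33))  # prints 1.0
```

So on purchase price alone, the faster memory only wins if it buys at least as large a percentage speedup as it adds in cost.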

You'll notice there is no hard drive. I have a bunch of crappy old 160 GB hard drives. I don't see much talk of hard drives being an issue in the threads I've read so far. Is it safe to assume that hard drive speed would not be a bottleneck with a system like the one I'm proposing (running Prime95 LL tests on Linux)?
Old 2016-01-28, 00:06   #11
Xyzzy
 
"Mike"
Aug 2002

2×7²×83 Posts

Quote:
Originally Posted by rleshane
You'll notice there is no hard drive. I have a bunch of crappy old 160 GB hard drives. I don't see much talk of hard drives being an issue in the threads I've read so far. Is it safe to assume that hard drive speed would not be a bottleneck with a system like the one I'm proposing (running Prime95 LL tests on Linux)?

We have run mprime with a 4GB USB key and Linux, and that was more than enough room.

You would only need 2×512MB RAM as well. (Do they make it that small any more?)

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.