mersenneforum.org Tweaking RAM & CPU

2011-01-06, 15:27   #1
lorgix

Sep 2010
Scandinavia

615₁₀ Posts

Tweaking RAM & CPU

Hello,

I'm trying to figure out clock speeds, timings and voltages for my RAM, MB & CPU.

Hardware:
ASUS P7H55
i3 540 3.07 -> 3.20GHz
4*2GB Corsair XMS3 1600MHz CL9

CPU & MB:
I'm running the CPU at 160*20 instead of spec 133*23. Vcore is set to Auto, which amounts to 1.13~1.14V in practice. This seems to require ~53W at boot. But then runs at ~50.1W. I recently installed an additional fan, since then the core temp hasn't passed 58°C. Before that it would reach at least 61°C.

Now; Should I tinker with core voltage, PCH voltage, IMC voltage or PLL voltage? Does the above make sense? Any comments, questions or tips?

RAM:
Spec says 9-9-9-24 @ 1600MHz, 1.65V. I'm currently trying 8-9-9-23 @ 160*10. Due to the rel. high voltage req. of these modules I have set IMC voltage to 1.15. Does anyone have any experience with this kind of modules?

Softer stuff:
I mostly use this computer for number crunching, so that should be kept in mind. Which of the above parameters should have the most impact on ECM & P-1 performance? You might say I'm tuning this machine mainly for those two tasks. Where is(/are) the bottleneck(s)?

Is there a simple way of knowing how much memory a specific ECM/P-1 task will require to run optimally? More specifically I'm wondering what the relationship between bounds and number looks like when I'm willing to use say 6GB for a calculation.

GPU?:
Does it make sense to buy a GPU for the sole purpose of factoring? If so; what's the minimum $$$ for that to make sense? I have no idea how one would go about using a GPU for factoring. I've heard of GTX 470... Any other recommendations?

Any feedback would be appreciated.

2011-01-06, 22:38   #2
Mini-Geek
Account Deleted

"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts

Quote:
 Originally Posted by lorgix Is there a simple way of knowing how much memory a specific ECM/P-1 task will require to run optimally? More specifically I'm wondering what the relationship between bounds and number looks like when I'm willing to use say 6GB for a calculation.
Prime95's readme.txt says this:

Code:
4) Factor in the information below about minimum, reasonable, and desirable memory amounts for some sample exponents. If you choose a value below the minimum, that is OK. The program will simply skip stage 2 of P-1 factoring.

Exponent   Minimum   Reasonable   Desirable
--------   -------   ----------   ---------
20000000    40MB        80MB        120MB
33000000    65MB       125MB        185MB
50000000    85MB       170MB        250MB

The "desirable" isn't the maximum that can be optimally used, it's just a reference for a desirable amount. More can be used and can make it faster and better, but it's probably not as big of a difference. ECM memory requirements are probably similar given identical bounds and exponent. The "optimal" would probably be that the whole stage 2 can be done in one go, but more realistically, if you allow a few hundred MB it should be good.

Quote:
 Originally Posted by lorgix Does it make sense to buy a GPU for the sole purpose of factoring? If so; what's the minimum $$$ for that to make sense?
For ECM/P-1, not at the moment, because AFAIK there aren't any ECM/P-1 apps made for GPUs yet. They can do other sorts of factoring extremely well compared to modern CPUs though, (AFAIK roughly 10x-100x faster, depending on how good of a GPU and what sort of factoring) and are at least on par with four modern cores on LL tests. The GTX 460 is another good suggestion.

2011-01-06, 22:41   #3
TheJudger

"Oliver"
Mar 2005
Germany

11×101 Posts

Hello,

Quote:
 Originally Posted by lorgix GPU?: Does it make sense to buy a GPU for the sole purpose of factoring? If so; what's the minimum $$$ for that to make sense? I have no idea how one would go about using a GPU for factoring. I've heard of GTX 470... Any other recommendations?
If factoring means trial factoring, I wouldn't buy a GPU for factoring Mersennes. (AFAIK there isn't any P-1 or ECM code for GPUs.)
PrimeNet isn't limited by trial factoring at all, so it doesn't make much sense. You'll need 2 of your CPU cores to feed a GTX 470 if you're using mfaktc, so you won't be able to run any LLs on the CPU. So save your money.

Oliver

2011-01-06, 23:19   #4
ixfd64
Bemusing Prompter

"Danny"
Dec 2002
California

4510₈ Posts

I think Kleinjung (or was it one of his students?) ported GMP-ECM to the PS3. However, the code isn't publicly available, so you might want to ask him about it.

2011-01-07, 09:16   #5
lorgix

Sep 2010
Scandinavia

3·5·41 Posts

Quote:
 Originally Posted by Mini-Geek Prime95's readme.txt says this:

Code:
4) Factor in the information below about minimum, reasonable, and desirable memory amounts for some sample exponents. If you choose a value below the minimum, that is OK. The program will simply skip stage 2 of P-1 factoring.

Exponent   Minimum   Reasonable   Desirable
--------   -------   ----------   ---------
20000000    40MB        80MB        120MB
33000000    65MB       125MB        185MB
50000000    85MB       170MB        250MB

The "desirable" isn't the maximum that can be optimally used, it's just a reference for a desirable amount. More can be used and can make it faster and better, but it's probably not as big of a difference. ECM memory requirements are probably similar given identical bounds and exponent. The "optimal" would probably be that the whole stage 2 can be done in one go, but more realistically, if you allow a few hundred MB it should be good.

For ECM/P-1, not at the moment, because AFAIK there aren't any ECM/P-1 apps made for GPUs yet. They can do other sorts of factoring extremely well compared to modern CPUs though, (AFAIK roughly 10x-100x faster, depending on how good of a GPU and what sort of factoring) and are at least on par with four modern cores on LL tests. The GTX 460 is another good suggestion.
Yes, I'm familiar with those guidelines. Don't know what they are based on though. I mean.. the 'Desirable' amount isn't enough to do stg2 in one run. Which is what I happen to desire. Then again that table has remained unchanged for quite some time now, right?

I'm hoping to find an approximation giving memory needed as a function of [FFT size or exponent] and bounds.

About ~3.5hrs ago I finished a successful P-1 run on M17873291. I did all 480 relative primes (that obv. weren't all relative primes, if my understanding is correct) in one run. It took ~4GB.

I've noticed that the memory use in ECM is much more complex than that in P-1. I would like to know the relationship in the case of ECM too.

From what I've heard, TF using GPU(s) is a reality. I'd be interested in that. Not solely for the purpose of GIMPS, but for the purpose of TF.

I don't know what hardware and software it takes though...

Any comments on the rest of my post? Hardware stuff?

Come on people! This is the 'Hardware' forum!

Thanks Mini-Geek.

2011-01-07, 09:20   #6
lorgix

Sep 2010
Scandinavia

3×5×41 Posts

Quote:
 Originally Posted by TheJudger Hello, if factoring means trial factoring I won't buy a GPU for factoring on mersennes. (AFAIK there isn't any P-1 or ECM code for GPUs) The primenet isn't limited by trial factoring at all so it doesn't make much sense. You'll need 2 of your CPU cores to feed a GTX 470 if you're using mfaktc so you won't be able to run any LLs on CPU. So save your money. Oliver
Yes, I'm thinking trial factoring.

You mean that GIMPS is limited by P-1 rather than by TF? That's my understanding. I'd be interested in faster TF anyway. For both GIMPS and other stuff.

I've heard of mfaktc, never used it.

How do you mean feed the GTX? Could you explain?

Thanks!

2011-01-07, 12:26   #7
TheJudger

"Oliver"
Mar 2005
Germany

10001010111₂ Posts

Hi!

Quote:
 Originally Posted by lorgix How do you mean feed the GTX? Could you explain?
In mfaktc the CPU does the preselection of factor candidates (sieving). A single core of the CPU you've mentioned can't generate these lists of factor candidates fast enough to keep a GTX 470 (in your example) busy all the time.

Oliver

2011-01-07, 13:04   #8
lorgix

Sep 2010
Scandinavia

267₁₆ Posts

Quote:
 Originally Posted by TheJudger Hi! In mfaktc the CPU does the preselection of factor candidates (sieving), a single core of the CPU you've mentioned can't generate these lists of factor candidates fast enough to keep a GTX 470 (in your example) busy all the time. Oliver
Isn't that pretty much equivalent to the GTX being better at TF than my CPU is? Which is the very reason I'm interested in GPUs to begin with.

Say you want to find the smallest factor of a given number. Should you use an i3 540 or a GTX 470? The latter would be faster, right?

2011-01-07, 16:00   #9
TheJudger

"Oliver"
Mar 2005
Germany

11·101 Posts

Take a look at this page: http://mersenne.org/various/math.php

When talking about mfaktc, the algorithm on that page is running on the GPU, while the preselection of candidates (the part which mentions the "sieve of Eratosthenes") runs on the CPU.
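To make the CPU/GPU split concrete, here is a minimal Python sketch of trial factoring a Mersenne number 2^p - 1. It relies on two standard facts: any factor q has the form 2kp + 1 with q ≡ ±1 (mod 8), and q divides 2^p - 1 exactly when 2^p mod q = 1. The function names and the tiny small-prime filter are illustrative assumptions, not mfaktc's actual code. In the split described above, the candidate generation would be the CPU's job and the modular powering the GPU's job.

```python
SMALL_PRIMES = (3, 5, 7, 11, 13)  # toy sieve; a real sieve uses far more primes


def candidate_factors(p, k_max):
    """Yield q = 2*k*p + 1 for k = 1..k_max that survive the cheap filters.

    This is the "preselection" step: drop q that cannot be a factor
    (q mod 8 must be 1 or 7) and q that is obviously composite.
    """
    for k in range(1, k_max + 1):
        q = 2 * k * p + 1
        if q % 8 not in (1, 7):
            continue
        if any(q % s == 0 and q != s for s in SMALL_PRIMES):
            continue
        yield q


def divides_mersenne(p, q):
    """Return True if q divides 2**p - 1, using modular exponentiation."""
    return pow(2, p, q) == 1


def trial_factor(p, k_max):
    """Return the first surviving candidate that divides 2**p - 1, or None."""
    for q in candidate_factors(p, k_max):
        if divides_mersenne(p, q):
            return q
    return None


if __name__ == "__main__":
    print(trial_factor(11, 10))  # 2**11 - 1 = 2047 = 23 * 89
    print(trial_factor(29, 10))  # 2**29 - 1 has the factor 233
```

The expensive part is `divides_mersenne`, which is why it pays to discard as many candidates as possible before calling it.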
2011-01-07, 16:14   #10
Mr. P-1

Jun 2003

7×167 Posts

Quote:
 Originally Posted by lorgix I did all 480 relative primes (that obv. weren't all relative primes, if my understanding is correct) in one run.
Your understanding isn't correct. Prime95 does P-1 stage 2 in blocks. Each block has a size which is a multiple of 30 (2*3*5), 210 (2*3*5*7), or 2310 (2*3*5*7*11). The "relative primes" are relative to the chosen blocksize. 480 is indeed all of the relatively prime congruence classes modulo 2310.
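The count of 480 can be checked directly: it is the number of residues below the blocksize that share no factor with it. A minimal Python sketch (the function name is just for illustration):

```python
from math import gcd


def coprime_classes(block_size):
    """Return the residues r in [1, block_size) with gcd(r, block_size) == 1."""
    return [r for r in range(1, block_size) if gcd(r, block_size) == 1]


if __name__ == "__main__":
    for block_size in (30, 210, 2310):
        print(block_size, len(coprime_classes(block_size)))
```

This gives 8 classes for blocksize 30, 48 for 210, and 480 for 2310, so the "relative primes" need not be primes themselves, only coprime to the blocksize.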

Quote:
 I mean.. the 'Desirable' amount isn't enough to do stg2 in one run. Which is what I happen to desire.
It's not difficult to extrapolate from known memory usage to that necessary to do all 480 relative primes in one pass. My default memory setting on my Core 2 Duo is 1370MB, which allows for one core to be doing 60 relative primes per pass on a 2560K FFT exponent, while the other is doing stage 1. That suggests that about 10GB would be sufficient to do all 480 relative primes on exponents of this size. What I don't know is whether Prime95 would choose a larger blocksize with that kind of memory available.

Bear in mind that the per pass overhead is small compared to the overall running time of the algorithm. If you double the number of relative primes per pass, from 20 to 40 say, you'll save X amount of time. Double it again, to 80, and the additional saving is X/2. The returns really do diminish quite rapidly.
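The halving of the saving follows from counting passes. Here is a minimal Python sketch, assuming 480 relative primes in total and a fixed overhead per pass (both the model and the function names are simplifying assumptions, not Prime95's actual accounting):

```python
from math import ceil

TOTAL_CLASSES = 480  # relative primes for blocksize 2310


def passes_needed(classes_per_pass):
    """Number of stage 2 passes needed to cover all classes."""
    return ceil(TOTAL_CLASSES / classes_per_pass)


def passes_saved(old_per_pass, new_per_pass):
    """Passes (and so units of per-pass overhead) saved by the larger setting."""
    return passes_needed(old_per_pass) - passes_needed(new_per_pass)


if __name__ == "__main__":
    for old, new in ((20, 40), (40, 80), (80, 160)):
        print(f"{old} -> {new}: saves {passes_saved(old, new)} passes")
```

Going from 20 to 40 drops 24 passes to 12, a saving of 12. Going from 40 to 80 saves only 6, and each further doubling of memory saves half as much again.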

For this reason, when speccing out a machine for GIMPS it's generally not cost effective to load it with vast amounts of memory. You'd do better to spend the money on a faster processor, faster memory, etc.

I'm not familiar with how prime95 ECM uses memory, but I would imagine that the same principles apply.

2011-01-07, 16:43   #11
Mini-Geek
Account Deleted

"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

Quote:
 Originally Posted by Mr. P-1 It's not difficult to extrapolate from known memory usage to that necessary to do all 480 relative primes in one pass. My default memory setting on my Core 2 Duo is 1370MB, which allows for one core to be doing 60 relative primes per pass on a 2560K FFT exponent, while the other is doing stage 1. That suggests that about 10GB would be sufficient to do all 480 relative primes on exponents of this size. What I don't know is whether Prime95 would choose a larger blocksize with that kind of memory available.
You can indeed extrapolate from known memory usage, and the amount is approximately [a constant]*[number of relative primes at a time] + [another constant]. To find the constants for a specific test, try a couple of different allowed memory settings, note how many MB and how many rel. primes at a time Prime95 says it's using for each, and do some simple algebra. For the number I was testing, the amount needed to do all 480 relative primes in one go would be about 10.83 GB. But yeah, it's not really a good idea to spec out a machine for GIMPS with 12 or more GB just so you can do a whole stage 2 in one go. Even if you're going to have it only do P-1 and it has four cores, 4 GB (or 6 GB to allow some for the OS and other tasks while still giving a full gig or more to each test) would be plenty.
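The "simple algebra" amounts to fitting a straight line through two observations. A minimal Python sketch follows; the two (relative primes, MB) pairs in the example are made-up placeholder numbers, not measurements from Prime95, and the function names are only for illustration.

```python
def fit_linear(obs_a, obs_b):
    """Solve mb = per_class * n + fixed from two (n, mb) observations."""
    (n1, mb1), (n2, mb2) = obs_a, obs_b
    per_class = (mb2 - mb1) / (n2 - n1)  # MB per relative prime in a pass
    fixed = mb1 - per_class * n1         # MB used regardless of n
    return per_class, fixed


def predict_mb(per_class, fixed, n):
    """Estimate the memory needed to process n relative primes per pass."""
    return per_class * n + fixed


if __name__ == "__main__":
    # Placeholder observations: (relative primes per pass, MB in use).
    per_class, fixed = fit_linear((20, 500), (60, 1400))
    print(per_class, fixed)
    print(predict_mb(per_class, fixed, 480))
```

With real numbers read off Prime95's output for your own exponent, `predict_mb(per_class, fixed, 480)` gives the estimate for doing all of stage 2 in one pass.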

Last fiddled with by Mini-Geek on 2011-01-07 at 16:47
