mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Hardware

2011-01-06, 15:27   #1
lorgix
 
Sep 2010
Scandinavia

615₁₀ Posts
Tweaking RAM & CPU

Hello,

I'm trying to figure out clock speeds, timings and voltages for my RAM, MB & CPU.


Hardware:

ASUS P7H55
i3 540
3.07 -> 3.20GHz
4*2GB Corsair XMS3 1600MHz CL9


CPU & MB:

I'm running the CPU at 160*20 instead of the spec 133*23. Vcore is set to Auto, which amounts to 1.13~1.14V in practice. This seems to draw ~53W at boot, but then settles at ~50.1W.

I recently installed an additional fan; since then the core temp hasn't passed 58°C. Before that it would reach at least 61°C.

Now: should I tinker with core voltage, PCH voltage, IMC voltage or PLL voltage? Does the above make sense? Any comments, questions or tips?


RAM:

Spec says 9-9-9-24 @ 1600MHz, 1.65V.
I'm currently trying 8-9-9-23 @ 160*10. Due to the relatively high voltage requirement of these modules, I have set the IMC voltage to 1.15V.

Does anyone have any experience with this kind of module?
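
The clock and timing figures above can be checked with a little arithmetic. This is only a sketch: the function names are made up for illustration, and it assumes the usual DDR convention that the I/O clock is half the quoted data rate, so one CAS cycle at "1600MHz" lasts 2000/1600 ns.

```python
def core_clock_mhz(bclk_mhz: float, multiplier: int) -> float:
    """CPU core clock = base clock (BCLK) times the CPU multiplier."""
    return bclk_mhz * multiplier


def cas_latency_ns(cl_cycles: int, data_rate_mts: float) -> float:
    """CAS latency in nanoseconds for DDR memory.

    Assumes the I/O clock is half the data rate, so one cycle
    takes 2000 / data_rate_mts nanoseconds.
    """
    return cl_cycles * 2000.0 / data_rate_mts


print(core_clock_mhz(160, 20))   # overclocked setting from the post
print(core_clock_mhz(400 / 3, 23))  # stock setting, BCLK taken as 133.33 MHz
print(cas_latency_ns(9, 1600))   # spec timing, CL9
print(cas_latency_ns(8, 1600))   # tightened timing, CL8
```

On these assumptions, going from CL9 to CL8 at the same data rate shortens the CAS latency by one cycle, i.e. 1.25 ns.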


Softer stuff:

I mostly use this computer for number crunching, so that should be kept in mind. Which of the above parameters should have the most impact on ECM & P-1 performance? You might say I'm tuning this machine mainly for those two tasks. Where is(/are) the bottleneck(s)?

Is there a simple way of knowing how much memory a specific ECM/P-1 task will require to run optimally? More specifically I'm wondering what the relationship between bounds and number looks like when I'm willing to use say 6GB for a calculation.


GPU?:

Does it make sense to buy a GPU for the sole purpose of factoring? If so; what's the minimum $$$ for that to make sense?

I have no idea how one would go about using a GPU for factoring.

I've heard of GTX 470... Any other recommendations?


Any feedback would be appreciated.
2011-01-06, 22:38   #2
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts

Quote:
Originally Posted by lorgix View Post
Is there a simple way of knowing how much memory a specific ECM/P-1 task will require to run optimally? More specifically I'm wondering what the relationship between bounds and number looks like when I'm willing to use say 6GB for a calculation.
Prime95's readme.txt says this:
Code:
4)  Factor in the information below about minimum, reasonable, and
desirable memory amounts for some sample exponents.  If you choose a
value below the minimum, that is OK.  The program will simply skip
stage 2 of P-1 factoring.

	Exponent	Minimum		Reasonable	Desirable
	--------	-------		----------	---------
	20000000	 40MB		   80MB		 120MB
	33000000	 65MB		  125MB		 185MB
	50000000	 85MB		  170MB		 250MB
The "desirable" isn't the maximum that can be optimally used, it's just a reference for a desirable amount. More can be used and can make it faster and better, but it's probably not as big of a difference.
ECM memory requirements are probably similar given identical bounds and exponent. The "optimal" would probably be that the whole stage 2 can be done in one go, but more realistically, if you allow a few hundred MB it should be good.
Quote:
Originally Posted by lorgix View Post
Does it make sense to buy a GPU for the sole purpose of factoring? If so; what's the minimum $$$ for that to make sense?
For ECM/P-1, not at the moment, because AFAIK there aren't any ECM/P-1 apps made for GPUs yet. They can do other sorts of factoring extremely well compared to modern CPUs though, (AFAIK roughly 10x-100x faster, depending on how good of a GPU and what sort of factoring) and are at least on par with four modern cores on LL tests. The GTX 460 is another good suggestion.
2011-01-06, 22:41   #3
TheJudger
 
"Oliver"
Mar 2005
Germany

11×101 Posts

Hello,

Quote:
Originally Posted by lorgix View Post
GPU?:

Does it make sense to buy a GPU for the sole purpose of factoring? If so; what's the minimum $$$ for that to make sense?

I have no idea how one would go about using a GPU for factoring.

I've heard of GTX 470... Any other recommendations?
If factoring means trial factoring, I wouldn't buy a GPU for factoring Mersennes. (AFAIK there isn't any P-1 or ECM code for GPUs.)
PrimeNet isn't limited by trial factoring at all, so it doesn't make much sense. You'll need 2 of your CPU cores to feed a GTX 470 if you're using mfaktc, so you won't be able to run any LLs on the CPU. So save your money.

Oliver
2011-01-06, 23:19   #4
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

4510₈ Posts

I think Kleinjung (or was it one of his students) ported GMP-ECM to the PS3. However, the code isn't publicly available, so you might want to ask him about it.
2011-01-07, 09:16   #5
lorgix
 
Sep 2010
Scandinavia

3·5·41 Posts

Quote:
Originally Posted by Mini-Geek View Post
Prime95's readme.txt says this:
Code:
4)  Factor in the information below about minimum, reasonable, and
desirable memory amounts for some sample exponents.  If you choose a
value below the minimum, that is OK.  The program will simply skip
stage 2 of P-1 factoring.

    Exponent    Minimum        Reasonable    Desirable
    --------    -------        ----------    ---------
    20000000     40MB           80MB         120MB
    33000000     65MB          125MB         185MB
    50000000     85MB          170MB         250MB
The "desirable" isn't the maximum that can be optimally used, it's just a reference for a desirable amount. More can be used and can make it faster and better, but it's probably not as big of a difference.
ECM memory requirements are probably similar given identical bounds and exponent. The "optimal" would probably be that the whole stage 2 can be done in one go, but more realistically, if you allow a few hundred MB it should be good.

For ECM/P-1, not at the moment, because AFAIK there aren't any ECM/P-1 apps made for GPUs yet. They can do other sorts of factoring extremely well compared to modern CPUs though, (AFAIK roughly 10x-100x faster, depending on how good of a GPU and what sort of factoring) and are at least on par with four modern cores on LL tests. The GTX 460 is another good suggestion.
Yes, I'm familiar with those guidelines. Don't know what they are based on though. I mean.. the 'Desirable' amount isn't enough to do stg2 in one run. Which is what I happen to desire. Then again that table has remained unchanged for quite some time now, right?

I'm hoping to find an approximation giving memory needed as a function of [FFT size or exponent] and bounds.

About 3.5 hrs ago I finished a successful P-1 run on M17873291. I did all 480 relative primes (that obv. weren't all relative primes, if my understanding is correct) in one run. It took ~4GB.


I've noticed that the memory use in ECM is much more complex than that in P-1. I would like to know the relationship in the case of ECM too.

From what I've heard, TF using GPU(s) is a reality. I'd be interested in that. Not solely for the purpose of GIMPS, but for the purpose of TF.

I don't know what hardware and software it takes though...


Any comments on the rest of my post? Hardware stuff?

Come on people! This is the 'Hardware' forum!

Thanks Mini-Geek.
2011-01-07, 09:20   #6
lorgix
 
Sep 2010
Scandinavia

3×5×41 Posts

Quote:
Originally Posted by TheJudger View Post
Hello,

If factoring means trial factoring, I wouldn't buy a GPU for factoring Mersennes. (AFAIK there isn't any P-1 or ECM code for GPUs.)
PrimeNet isn't limited by trial factoring at all, so it doesn't make much sense. You'll need 2 of your CPU cores to feed a GTX 470 if you're using mfaktc, so you won't be able to run any LLs on the CPU. So save your money.

Oliver
Yes, I'm thinking trial factoring.

You mean that GIMPS is limited by P-1 rather than by TF? That's my understanding. I'd be interested in faster TF anyway. For both GIMPS and other stuff.

I've heard of mfaktc, never used it.

How do you mean feed the GTX? Could you explain?


Thanks!
2011-01-07, 12:26   #7
TheJudger
 
"Oliver"
Mar 2005
Germany

10001010111₂ Posts

Hi!

Quote:
Originally Posted by lorgix View Post
How do you mean feed the GTX? Could you explain?
In mfaktc the CPU does the preselection of factor candidates (sieving). A single core of the CPU you've mentioned can't generate these lists of factor candidates fast enough to keep a GTX 470 (in your example) busy all the time.

Oliver
2011-01-07, 13:04   #8
lorgix
 
Sep 2010
Scandinavia

267₁₆ Posts

Quote:
Originally Posted by TheJudger View Post
Hi!



In mfaktc the CPU does the preselection of factor candidates (sieving). A single core of the CPU you've mentioned can't generate these lists of factor candidates fast enough to keep a GTX 470 (in your example) busy all the time.

Oliver
Isn't that pretty much equivalent to the GTX being better at TF than my CPU is? Which is the very reason I'm interested in GPUs to begin with.

Say you want to find the smallest factor of a given number. Should you use an i3 540 or a GTX 470? The latter would be faster, right?
2011-01-07, 16:00   #9
TheJudger
 
"Oliver"
Mar 2005
Germany

11·101 Posts

Take a look at this page: http://mersenne.org/various/math.php

When talking about mfaktc, the algorithm on that page is running on the GPU while the preselection of candidates (that part which mentions "sieve of Eratosthenes") runs on the CPU.
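
The split described here can be sketched in a few lines. This is not mfaktc's code, only a toy illustration of the two stages: it relies on the standard facts that any factor q of 2^p - 1 (p an odd prime) has the form 2kp + 1 and is 1 or 7 mod 8. The names `passes_sieve` and `trial_factor` and the short list of sieving primes are assumptions made for this example.

```python
SMALL_PRIMES = (3, 5, 7, 11, 13)  # tiny sieve, for illustration only


def passes_sieve(q: int) -> bool:
    """Cheap preselection (the part mfaktc leaves to the CPU).

    Rejects candidates that are not 1 or 7 mod 8, or that have a
    small prime divisor and so cannot themselves be prime.
    """
    if q % 8 not in (1, 7):
        return False
    return all(q == s or q % s != 0 for s in SMALL_PRIMES)


def trial_factor(p: int, max_k: int):
    """Return the smallest q = 2*k*p + 1 with k <= max_k dividing 2**p - 1.

    The modular exponentiation is the expensive step (the part that
    runs on the GPU in mfaktc). Returns None if no factor is found.
    """
    for k in range(1, max_k + 1):
        q = 2 * k * p + 1
        if not passes_sieve(q):
            continue
        if pow(2, p, q) == 1:  # q divides 2**p - 1
            return q
    return None


print(trial_factor(11, 10))  # M11 = 2047 = 23 * 89
print(trial_factor(29, 10))  # M29 = 233 * 1103 * 2089
print(trial_factor(13, 10))  # M13 = 8191 is prime, so nothing is found
```

The sieve is cheap per candidate but inherently serial-looking bookkeeping, while the `pow` step is the same independent computation repeated for every surviving candidate, which is why it maps well onto a GPU.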
2011-01-07, 16:14   #10
Mr. P-1
 
Jun 2003

7×167 Posts

Quote:
Originally Posted by lorgix View Post
I did all 480 relative primes (that obv. weren't all relative primes, if my understanding is correct) in one run.
Your understanding isn't correct. prime95 does P-1 stage 2 in blocks. Each block has a size which is a multiple of 30 (2*3*5), 210 (2*3*5*7), or 2310 (2*3*5*7*11). The "relative primes" are relative to the chosen blocksize. 480 is indeed all of the relatively prime congruence classes modulo 2310.
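
The count of 480 is easy to verify by brute force. The sketch below simply counts the residues coprime to each blocksize; the helper name `coprime_classes` is made up for this example and has nothing to do with prime95's internals.

```python
from math import gcd


def coprime_classes(block: int) -> list:
    """Residues r in [1, block) with gcd(r, block) == 1."""
    return [r for r in range(1, block) if gcd(r, block) == 1]


for block in (30, 210, 2310):
    print(block, len(coprime_classes(block)))
```

For 30, 210 and 2310 this gives 8, 48 and 480 classes respectively, matching (2-1)(3-1)(5-1), then times (7-1), then times (11-1).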

Quote:
I mean.. the 'Desirable' amount isn't enough to do stg2 in one run. Which is what I happen to desire.
It's not difficult to extrapolate from known memory usage to that necessary to do all 480 relative primes in one pass. My default memory setting on my Core 2 Duo is 1370MB, which allows for one core to be doing 60 relative primes per pass on a 2560K FFT exponent, while the other is doing stage 1. That suggests that about 10GB would be sufficient to do all 480 relative primes on exponents of this size. What I don't know is whether prime95 would choose a larger blocksize with that kind of memory available.

Bear in mind that the per pass overhead is small compared to the overall running time of the algorithm. If you double the number of relative primes per pass, from 20 to 40 say, you'll save X amount of time. Double it again, to 80, and the additional saving is X/2. The returns really do diminish quite rapidly.
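
A rough way to see the diminishing returns: if every pass carries the same fixed overhead (an assumption made here for illustration), the saving from a larger memory allowance is proportional to the number of passes avoided. The sketch counts passes for 480 relative primes; `passes_needed` is a hypothetical helper name.

```python
from math import ceil

TOTAL_REL_PRIMES = 480  # blocksize 2310, as discussed above


def passes_needed(per_pass: int) -> int:
    """Number of stage 2 passes when `per_pass` relative primes fit in memory."""
    return ceil(TOTAL_REL_PRIMES / per_pass)


previous = None
for per_pass in (20, 40, 80, 160, 480):
    n = passes_needed(per_pass)
    saved = "" if previous is None else f" (saves {previous - n} passes)"
    print(f"{per_pass:3d} per pass -> {n:2d} passes{saved}")
    previous = n
```

Going from 20 to 40 per pass avoids 12 passes, while going from 40 to 80 avoids only 6, which is the X and X/2 pattern described above.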

For this reason, when speccing out a machine for GIMPS it's generally not cost-effective to load it with vast amounts of memory. You'd do better to spend the money on a faster processor, faster memory, etc.

I'm not familiar with how prime95 ECM uses memory, but I would imagine that the same principles apply.
2011-01-07, 16:43   #11
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

Quote:
Originally Posted by Mr. P-1 View Post
It's not difficult to extrapolate from known memory usage to that necessary to do all 480 relative primes in one pass. My default memory setting on my Core 2 Duo is 1370MB, which allows for one core to be doing 60 relative primes per pass on a 2560K FFT exponent, while the other is doing stage 1. That suggests that about 10GB would be sufficient to do all 480 relative primes on exponents of this size. What I don't know is whether prime95 would choose a larger blocksize with that kind of memory available.
You can indeed extrapolate from known memory usage. The amount is approximately [a constant]*[number of relative primes at a time] + [another constant]. To find the constants for a specific test, run it with a couple of different allowed memory settings, note how many MB and how many relative primes at a time Prime95 says it's using, and do some simple algebra. For the number I was testing, the amount needed to do all 480 relative primes in one go would be about 10.83 GB. But yeah, it's not really a good idea to spec out a machine for GIMPS with 12 or more GB just so you can do a whole stage 2 in one go. Even if you're going to have it only do P-1 and it has four cores, 4 GB (or 6 GB to allow some for the OS and other tasks while still giving a full gig or more to each test) would be plenty.
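
The "simple algebra" amounts to fitting a straight line through two observations. In the sketch below the two (relative primes, MB) pairs are invented placeholder values, not measurements from Prime95; substitute whatever the program reports for your own exponent. The function names are likewise only illustrative.

```python
def fit_line(obs_a, obs_b):
    """Solve memory = slope * rel_primes + intercept from two observations.

    Each observation is a (rel_primes_per_pass, memory_mb) pair.
    """
    (n1, m1), (n2, m2) = obs_a, obs_b
    slope = (m2 - m1) / (n2 - n1)
    intercept = m1 - slope * n1
    return slope, intercept


def memory_for(rel_primes: int, slope: float, intercept: float) -> float:
    """Estimated memory in MB to process `rel_primes` at a time."""
    return slope * rel_primes + intercept


# Placeholder observations (hypothetical, NOT real Prime95 output).
slope, intercept = fit_line((10, 250.0), (30, 690.0))
print(f"MB per relative prime: {slope}, fixed overhead: {intercept} MB")
print(f"Estimate for all 480 at once: {memory_for(480, slope, intercept)} MB")
```

With more than two observations a least-squares fit would be more robust, but two points are enough to recover the two constants in the formula.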

Last fiddled with by Mini-Geek on 2011-01-07 at 16:47

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.