mersenneforum.org  

2013-05-13, 18:18   #1
Uncwilly

Calculating optimal P-1 memory

I searched around a while back and did not see an answer for this...

I know that the more memory P-1 stage 2 is given, the better. What I was trying to figure out or find is whether there is a formula or tool somewhere to calculate how many relative primes P-1 will process for a given set of parameters. On one machine I noticed that when I went from one amount of memory to another, the number of relative primes jumped disproportionately. I would love to find out where the various break points are for various assignments; then I could tune memory settings to be both livable and productive. Also, if I knew that I could add 50 or 100 MB to the night setting and get a boost in the number of relative primes, that would be great.
2013-05-13, 18:58   #2
Mini-Geek

From http://www.mersennewiki.org/index.ph...yama_extension, it seems to me that the number of relative primes depends on the Brent-Suyama extension e used, but I don't know exactly what memory equates to what e value.
2013-05-13, 19:51   #3
kladner

I have been puzzled by this, too. Until recently, I had been allowing P95 to use 27000MB. Rather than doing 480 relative primes (RP) in a single pass, it would mostly do multiple passes on 960 RP. A change in memory usage led me to reduce the P95 allocation to 25500MB. Now it only does 960 RP occasionally, but it is still doing multi-pass at 480 RP. It seems that somewhere in between it would find it possible to do stage 2 in a single pass.
2013-05-15, 21:42   #4
Mr. P-1

Quote:
Originally Posted by Mini-Geek
From http://www.mersennewiki.org/index.ph...yama_extension ...

That page was taken (with my permission) verbatim, or nearly so, from a post I made here some months ago. Frankly, I'm surprised it hasn't been substantially revised by someone who knows more than I do. I am very far from being the most qualified person to have written it.

Quote:
it seems to me that the number of relative primes depends on the Brent-Suyama extension e used, but I don't know exactly what memory equates to what e value.
Yes and no.

No because, as the wiki page says, the number of relative primes is computed from a different parameter, denoted by d, not from the e value.

Yes because the program tries to optimise the choice of parameters, including d and e, subject to the maximum memory available. For fixed d, a higher value of e requires slightly more memory, so it can happen that there is enough memory available for a higher e, or a larger d, but not both.
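To make that concrete, here is a toy sketch in Python. The memory model is a loud simplification of my own (one buffer per table entry plus roughly e+4 scratch buffers, with a hypothetical 16MB buffer size), not P95's actual accounting; it just shows how a tight budget can afford d = 2310 with a small e, or a smaller d with a larger e, but not both.

Code:
from math import gcd

# Toy memory model -- my simplification, NOT P95's real accounting:
# one buffer per relative prime in the table, plus about e+4 scratch
# buffers, each holding one FFT-sized residue.

def phi(n):
    """Count rp in [1, n) with gcd(rp, n) == 1 (Euler's totient)."""
    return sum(1 for rp in range(1, n) if gcd(rp, n) == 1)

BUF_MB = 16  # hypothetical: a 2M-point FFT at 8 bytes per value

def fits(d, e, budget_mb):
    return (phi(d) + e + 4) * BUF_MB <= budget_mb

for d, e in [(2310, 2), (2310, 12), (1890, 12)]:
    print(d, e, fits(d, e, budget_mb=7800))
# -> under this toy budget, d=2310 fits only with the small e;
#    stepping down to d=1890 frees enough buffers to afford e=12.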

Quote:
Originally Posted by kladner
I have been puzzled by this, too. Until recently, I had been allowing P95 to use 27000MB. Rather than doing 480 RP in a single pass, it would mostly do multi-pass on 960 RP.
As is perhaps obvious, the explanation I gave for how the number of relative primes is calculated is a simplification, partly because my purpose was to explain the Brent-Suyama extension, not to go into P95 internals, and partly because, although I know what the program does, I don't understand why.

Specifically what the program does is choose d to be a small primorial (30, 210 or 2310) or a multiple of one of these values smaller than the next primorial. The number of relative primes is then 8, 48, or 480, or the corresponding multiple of one of these.
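In other words, the relative-prime count is Euler's totient of d: phi(30) = 8, phi(210) = 48, phi(2310) = 480, and a multiple of a primorial scales accordingly. A quick sketch to tabulate this (plain Python, nothing P95-specific):

Code:
from math import gcd

def num_rel_primes(d):
    """phi(d): how many rp in [1, d) are coprime to d."""
    return sum(1 for rp in range(1, d) if gcd(rp, d) == 1)

for d in [30, 60, 210, 420, 1890, 2310, 4620]:
    print(d, num_rel_primes(d))
# 30 -> 8, 210 -> 48, 2310 -> 480, and multiples scale with the
# multiplier: 60 -> 16, 420 -> 96, 4620 -> 960 (kladner's 960 RP),
# while 1890 = 9*210 -> 432.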

But I don't understand why it would ever choose a multiple (other than the next primorial up). Apparently if it can manage to do so, then there is a slight improvement in speed, but I cannot for the life of me think of a reason why this should be so.

Quote:
A change in memory usage led me to reduce the P95 allocation to 25500MB. Now it only does 960 RP occasionally, but it is still doing multi-pass at 480 RP. It seems that somewhere in between it would find it possible to do stage 2 in a single pass.
For exponents in the current range for P-1 assignments, I'm using 12107MB to do P-1 stage 2 in one pass with 432 relative primes, E=12.
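Incidentally, 432 is consistent with the rule above: it is phi(1890), and 1890 = 9*210 is a multiple of 210 below the next primorial 2310. A one-line check:

Code:
from math import gcd
print(sum(1 for rp in range(1, 1890) if gcd(rp, 1890) == 1))  # phi(1890) = 432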
Mr. P-1 is offline   Reply With Quote
Old 2013-05-15, 22:10   #5
owftheevil
 
owftheevil's Avatar
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

32×5×7 Posts
Default

Quote:
Originally Posted by Mr. P-1
But I don't understand why it would ever choose a multiple (other than the next primorial up). Apparently if it can manage to do so, then there is a slight improvement in speed, but I cannot for the life of me think of a reason why this should be so.
In each pass, it is computing E^((k*d)^e) - E^(rp^e) for each of the relative primes rp that it's processing, then increasing k by 2 and repeating until k*d gets up to B2. It needs 2*e transforms for each change of k, so a larger d results in fewer "change of base" operations per pass. This gain is balanced against the possible increase in the number of passes, and the associated initialization costs, that a larger d might require.

The next primorial gives a rather dramatic increase in the number of relative primes, so sometimes the initialization costs for the increased number of passes outweigh the other advantages of the larger primorial.
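To put rough numbers on that balance, here is a toy cost model in Python. The constants are hypothetical (the per-pass initialization cost in particular is invented for illustration), and this is a sketch of the tradeoff just described, not P95's actual optimizer:

Code:
from math import gcd, ceil

def phi(n):
    return sum(1 for rp in range(1, n) if gcd(rp, n) == 1)

def stage2_cost(b1, b2, d, e, table_size, init_per_pass=5000):
    """Toy estimate: transforms spent changing k, plus per-pass setup.
    table_size = how many relative primes fit in memory at once;
    init_per_pass is a made-up stand-in for initialization costs."""
    passes = ceil(phi(d) / table_size)   # memory forces this many passes
    k_steps = ceil((b2 - b1) / (2 * d))  # k goes up by 2 until k*d ~ B2
    return passes * (k_steps * 2 * e + init_per_pass)

for d in (210, 420, 2310):  # primorial, multiple, next primorial
    print(d, stage2_cost(500_000, 10_000_000, d, e=6, table_size=100))
# The multiple d=420 halves the k-steps of d=210 at the same pass
# count, while d=2310 needs 5 passes, so its setup costs eat into
# the gain from even fewer k-steps.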

Clear as mud?

Last fiddled with by owftheevil on 2013-05-15 at 22:27 Reason: Left out the base
2013-05-15, 23:29   #6
kladner

Thanks to Mr. P-1 and owftheevil for the information. I now have at least a vague sense of what's going on. I can also see that I left out a key variable in my previous account: the number of HighMem workers. At the moment this is five, which means that any particular run never comes close to having 12 gigabytes; they rarely exceed 6 GB. I do consistently come in at E=12, however.

Even with a total of 32 GB, I can't really lock in 12 GB per worker without having to intervene once in a while to let the stage 2 runs catch up, and I'm too lazy to mess with things that much.