#1
"Antonio Key"
Sep 2011
UK
3²·59 Posts
Consider the scenario:
I have 2 worker threads on P-1 jobs. I set the maximum amount of memory to use at 4GB. P95 sets optimum B1 and B2 bounds based on 4GB for each worker thread. When both workers reach stage 2 they end up sharing the memory, so each gets about 2GB. At that point the "optimum" B1 and B2 bounds are no longer valid. (Yes, I know I could set the maximum number of high-memory workers.) What effect does this have on the validity of the result or on the processing speed?
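For reference, as far as I can tell the GUI memory setting just ends up as a single global line in local.txt, with nothing saying how it should be split between the two workers:

Code:
Memory=4096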
#2
Romulan Interpreter
"name field"
Jun 2011
Thailand
10,273 Posts
Assuming you have those 4GB and they are available to P95 at that moment, there is no effect. Stage 2 will get smaller clusters of primes to test if more workers fight over that amount of memory, but the overall output proceeds at about the same speed. Think about the scenario where you artificially increase B1/B2 by specifying them in the worktodo file (Pfactor or Pminus1 lines), or by saying that your P-1 saves 5, 10, etc. LL tests. That scenario doesn't "decrease" the output. Of course the tests take longer because the limits are higher, increasing your chance of finding a factor, but it takes (about) the same amount of time to reach a given limit Bx > B1 in stage 2, no matter whether B2 is a bit higher than Bx or a lot higher than Bx. Your computer can't give more output than it can give in stage 2, whether you run full throttle with one worker or two times half throttle with 2 workers.
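To make the worktodo part concrete, the two kinds of lines I mean look roughly like this (exponent and bounds are made up, and I'm writing the syntax from memory, so double-check against readme.txt before using it):

Code:
Pfactor=1,2,50000021,-1,72,5
Pminus1=1,2,50000021,-1,1000000,30000000

In the Pfactor line the last field is the number of LL tests a factor would save - raising it from 1 or 2 to 5 is the "cheat" that makes P95 choose bigger bounds on its own; in the Pminus1 line the last two fields pin B1 and B2 directly.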
Last fiddled with by LaurV on 2012-09-04 at 08:23
#3
garo
Aug 2002
Termonfeckin, IE
2⁴×173 Posts
Use the MaxHighMemWorkers setting. Read undoc.txt.
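For the scenario above that would be something like the following in local.txt (assuming I'm reading undoc.txt right about where the setting lives):

Code:
Memory=4096
MaxHighMemWorkers=1

Only one worker at a time is then allowed to grab the large stage 2 allocation.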
#4
"Antonio Key"
Sep 2011
UK
3²·59 Posts
@garo - I did say I knew about setting high mem workers (I've set it to 1, as it happens).
@LaurV - I do have the 4GB to spare, so it does get used by P95. I understand (and have seen) what you are saying. However, your response implies that there is nothing really optimum about the choice of B1 and B2 based on the amount of memory available, doesn't it?
#5
Romulan Interpreter
"name field"
Jun 2011
Thailand
10100000100001₂ Posts
Choosing B1 and B2 is a complex process. You need to maximize the chance of finding a factor while spending as little time as possible doing P-1. This is an "optimization" problem. If you take higher limits, your chances of finding a factor increase a lot, but you will spend more time doing P-1, and you will also need "reasonable" amounts of memory (to avoid a lot of overhead when clustering the primes in stage 2). Taking larger limits (and using more memory) does not increase or decrease your speed: you do more work, so it takes longer, but you don't do "faster" or "slower" work. More memory in stage 2 helps reduce the overhead of clustering the primes, and may help the Brent-Suyama extension (increasing your chances of finding a factor).

The most important thing is how much work you would save if you find a factor. If you save no work, it makes no sense to do "deep" P-1 (it may make sense for you, to be the happy guy who found a factor - I don't mean you personally, the "general you" - but it is not helpful for the project). Of course, you can always "cheat" P95 into believing you save a lot of work (use 3, 5, etc. instead of 1 or 2 "saved LL" tests in worktodo) or set the limits manually, as I do for exponents under 100k - you can check and see how I took all of them to B1/B2 = 10^8/2*10^9 in the last weeks, and found some factors on the way, which saved no work (?!?) because that range has been double-LL-ed for ages, but now I am happy I found a factor.
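Using the same Pminus1 syntax as in my earlier sketch, a manually-bounded line for one of those small exponents would look roughly like this (the exponent is a placeholder, syntax from memory):

Code:
Pminus1=1,2,99991,-1,100000000,2000000000

The last two fields are B1 and B2, i.e. the 10^8 and 2*10^9 above; P95 then runs exactly those limits instead of choosing its own.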
Depending on the range you are testing, there may be no difference between 2GB and 4GB of memory allocated to P-1. Brent-Suyama is anyhow more or less a shot in the sky while waiting for a duck to fall... There are some nice posts somewhere around here from the past, where two guys (I think Dubslow and flashjh) were using the same memory settings and one of them found a BrS factor while the other did not.

Edit: a real-life example: I have 3 computers with mobile CPUs (i5-540M, i7-620M, etc.) which used to do P-1 with 4 workers each (those CPUs are 2 cores with HT, and I found that HT helps for P-1 on small exponents). All of them have 4GB of memory in total, so I could only give 1GB to P95. I tried giving it more, like 2GB, but Windows could not survive in the remaining memory and I kept getting into trouble, so I reduced the P95 chunk to 1GB. I tried all the MaxHighMemWorkers possibilities: 1, 2, 3, 4. If you have lots of assignments in the worktodo file, then 1 and 2 work better. But when you have few assignments, 3 and 4 work better (of course, with 4 workers doing stage 2, if it happens, they only use about 256MB each and take small clusters of primes for pairing).

Why are 1 and 2 (much) worse when you have few assignments? Because stage 1 runs much faster than stage 2, taking more CPU resources (remember, 2 physical cores), so after a while all exponents had finished stage 1 but only 1 or 2 workers were allowed to do stage 2, and the rest of the workers were waiting. In that situation you either have to increase B1, or decrease B2, or request more work from the server, or - the best idea - give all workers the green light for stage 2. There is always a lot of "fine tuning" you can do. That is how I reached the B2 ~ 20*B1 ratio for that range of exponents.

Last fiddled with by LaurV on 2012-09-04 at 10:37
#6
"Antonio Key"
Sep 2011
UK
213₁₆ Posts
Thanks for taking the time to explain, LaurV.
I wasn't too bothered by what was happening, just interested in whether there was any real impact. I suppose my ideal would be the ability to specify the amount of memory per worker in the CPU options (i.e. 2GB per worker, rather than 4GB to P95). That way everything would remain consistent, and the workers wouldn't have to stop processing to share out the available memory (currently the first worker to start stage 2 uses all the memory until a second worker tries to start stage 2). I just don't want to think about what happens if you have 4, 6, or more cores all fighting for the memory with the current set-up!
#7
"Kieren"
Jul 2011
In My Own Galaxy!
2·3·1,693 Posts

Quote:
You can do this in local.txt. From undoc.txt:

Code:
The Memory=n setting in local.txt refers to the total amount of memory the program can use. You can also put this in the [Worker #n] section to place a maximum amount of memory that one particular worker can use.
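So for the two-worker scenario that started the thread, something along these lines should keep each worker at 2GB no matter what the other is doing (values are in MB; just a sketch based on the undoc.txt text above, not tested):

Code:
Memory=4096
[Worker #1]
Memory=2048
[Worker #2]
Memory=2048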
Similar Threads
Thread | Thread Starter | Forum | Replies | Last Post
Stage 2 Memory Settings | gamer30 | Software | 17 | 2012-08-23 20:02 |
memory isn't freed _AFTER_ P-1 stage 2? | TheJudger | Software | 13 | 2009-08-06 03:30 |
optimal memory settings for the P-1 stage | S485122 | Software | 16 | 2007-05-28 12:08 |
memory usage in P-1 stage 1 | James Heinrich | Software | 5 | 2005-03-22 20:05 |
memory usage in stage 2 of P-1 factoring | gckw | Software | 3 | 2003-09-07 06:56 |