mersenneforum.org > Great Internet Mersenne Prime Search > PrimeNet

Old 2021-07-22, 00:14   #45
chalsall
If I May
"Chris Halsall"
Sep 2002
Barbados

2×11×443 Posts

Quote:
Originally Posted by PhilF View Post
Another SPE?
Yeah. Most likely entirely my fault.

She was (and remains) very smart, but she needed attention I simply didn't have the time to supply (I presumed we could both keep ourselves productively busy).

I can laugh about it now. It almost killed me at the time.
Old 2021-07-22, 00:21   #46
Viliam Furik
 
"Viliam Furík"
Jul 2018
Martin, Slovakia

2⁴×37 Posts

Quote:
Originally Posted by chalsall View Post
Yes. I was hinting that perhaps those who have access to the code, and the data, might take a look...

Please trust me on this... You don't want LaurV to point out your SPEs...
Oh, so it was not addressed to me. Ok, then.

My guess as to why this mistake happens (assuming I entirely trust your observation that the numbers are not changing, even though they should) is that somewhere in the code there is a series of possibly nested if statements that are not arranged properly, causing this situation.
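A purely hypothetical illustration of that failure mode - this is invented Python, not PrimeNet's actual code - where one branch of the nesting never consults the completed work:
Code:
# Purely hypothetical sketch of the suspected failure mode; invented code.
def update_p1_needed(category, exponent, p1_completed):
    """Decide whether an exponent still needs P-1 (True = still needed)."""
    if category == 0:
        if p1_completed:
            return False   # Cat 0: completed P-1 clears the flag as expected...
    elif exponent > 94_000_000:
        return True        # ...but this branch never looks at p1_completed,
                           # so completed P-1 above 94M appears to change nothing.
    return True

# A 105M exponent outside Cat 0 stays flagged even after its P-1 is done:
print(update_p1_needed(category=3, exponent=105216541, p1_completed=True))  # True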
Old 2021-07-22, 00:31   #47
chalsall
If I May
"Chris Halsall"
Sep 2002
Barbados

10011000010010₂ Posts

Quote:
Originally Posted by Viliam Furik View Post
My guess as to why this mistake happens (assuming I entirely trust your observation that the numbers are not changing, even though they should) is that somewhere in the code there is a series of possibly nested if statements that are not arranged properly, causing this situation.
My apologies for sharing my sorry life history.

Getting back on topic... This report shows many candidates which need a P-1 in 94M and above. These are not being handed out by Primenet automatically.

I (and others) have been investing some compute to try to clear them. But we don't actually understand the criteria, and the work we've been doing does not seem to be having an impact.

I hope and trust that is clear.
Old 2021-07-22, 00:43   #48
Viliam Furik
 
"Viliam Furík"
Jul 2018
Martin, Slovakia

2⁴×37 Posts

Quote:
Originally Posted by chalsall View Post
My apologies for sharing my sorry life history.
It was interesting reading in fact.

Quote:
Originally Posted by chalsall View Post
Getting back on topic... This report shows many candidates which need a P-1 in 94M and above. These are not being handed out by Primenet automatically.

I (and others) have been investing some compute to try to clear them. But we don't actually understand the criteria, and the work we've been doing does not seem to be having an impact.

I hope and trust that is clear.
Yep, crystal clear. I only said that I don't fully trust the observation because I didn't do it myself. 99%, but still not 100%. If we include philosophical reasoning, it's not 100% even if I saw it happen myself (or didn't, in this case).
Old 2021-07-22, 02:31   #49
LaurV
Romulan Interpreter
Jun 2011
Thailand

10010110100001₂ Posts

Quote:
Originally Posted by chalsall View Post
My OCD doesn't like having values in that column before the Cat 0 wavefront...
+1. Same here!

Quote:
Originally Posted by chalsall View Post
drkirkby seems to be fixated on his sole instance which has a lot of memory.
What he fails to understand is some of us actually have scores of machines. He is, actually, less than even interesting.
+2. Or +3...

Quote:
Originally Posted by PhilF View Post
Another SPE?
Brilliant! Thanks for the good laugh.

Quote:
Originally Posted by chalsall View Post
It almost killed me at the time.
Poor you!

What does not kill you makes you stronger.

Back on topic: higher P-1 bounds (because the server still thinks 2 LL tests are saved when a factor is found, as Ken said) won't hurt anybody, in my opinion. I think they are beneficial in the long term and should be kept - or at least for a while, since as we progress higher up the exponent list, P-1 becomes harder and TF becomes easier. What I think is more important to give some thought to in the future is doing P-1 before the last TF bit. My recent work in the "two k" project convinced me even more of the utility (and efficiency) of doing more P-1 at the expense of TF (which is what RDS said years ago). (Granted, one could argue that in the "two k" project the exponents are much lower, so P-1 gets a lot easier while TF gets a lot harder, but my "feeling" still stands. Of course, I would not mind somebody refuting me - or supporting me - with numbers; see the sketch below.)
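To put toy numbers on that ordering argument: all four inputs below are illustrative guesses (the ~1/b heuristic for the chance of a factor in one TF bit, and a P-1 chance like those quoted elsewhere in this thread), and overlap between the two factor populations is ignored.
Code:
# If whichever job runs first finds a factor, the second job is skipped.
def expected_cost(first_cost, first_prob, second_cost):
    return first_cost + (1 - first_prob) * second_cost

c_tf, p_tf = 10.0, 1 / 76   # last TF bit: cost (arbitrary units), ~1/b chance at b=76
c_p1, p_p1 = 8.0, 0.045     # P-1 run: cost and ~4.5% chance; both guesses

print("TF bit first:", expected_cost(c_tf, p_tf, c_p1))  # ~17.9 units
print("P-1 first   :", expected_cost(c_p1, p_p1, c_tf))  # ~17.6 units
# The cheaper and more likely to succeed P-1 is, the more running it first wins.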


Edit: see also this thread which just popped up right now with a new post. In fact, the two threads may be merged in the future, as they mainly debate the same things.

Old 2021-07-22, 13:47   #50
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

616₈ Posts

Quote:
Originally Posted by chalsall View Post
drkirkby seems to be fixated on his sole instance which has a lot of memory.
Yeah, that's why I did the tests down to 128 MB. I can assure you, I read your childish comment last night and didn't lose any sleep over it.
Old 2021-07-22, 14:56   #51
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

18E₁₆ Posts

Quote:
Originally Posted by kriesel View Post
Bounds would be somewhat higher for the normal case of tests_saved=2, which is what PrimeNet issues.
Estimated or actual stage run times for those different ram settings posted above would be useful info.
As would varying ram for tests_saved=2.
Unfortunately, those tests would take an awful lot of time to do. I would have to repeat them all, doing stage 1 and then enough of stage 2 to get an estimate of the time to complete. I don't know how long that would take, but I suspect quite a considerable time.

I noticed that I had not completed the P-1 factoring of M105216541 properly: the results submitted were for stage 1 only, so the bounds would be woefully inadequate - the subject of this thread. I'm now going to do stage 1 and stage 2 for tests_saved of 1 and 2, but I'm reluctant to repeat them for every RAM combination. I'm currently doing so with maximum RAM. If you want to suggest a particular limit I will repeat for that, but I'm not going to do every combination again, as it would be too time-consuming. (I'm hoping the spoiler avoids childish comments.)
Quote:
Originally Posted by kriesel View Post
Perhaps beyond 16-64GB/worker there's little or no point to more ram at ~105M exponent. TB/3 might be handy near 1G.
Yes, agreed: 16 to 64 GB/worker seems to be in the optimal range. What do you mean by 1G - exponents around 10^9? Short of some major breakthrough, we will not get there in my lifetime.

The other thing that is apparent from the data is that 128 and 256 MB don't allow stage 2 to run, so both have the same bounds (B1=B2=434000) and only a 1.19% chance of finding a factor. In contrast, 512 MB allows stage 2 to run, which more than doubles the chance of finding a factor, to 2.17%. I wonder if George would consider increasing the default P-1/P+1/ECM stage 2 memory from 0.3 GB to 0.5 GB, which should benefit anyone who does not change the default RAM setting. In this day and age, 512 MB is very little RAM.
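As a back-of-envelope illustration of why that doubling matters (the per-test cost below is a placeholder, not a measured figure):
Code:
# Expected primality-test effort saved scales directly with the factor chance.
tests_saved = 2
test_cost = 25.0   # hypothetical GHz-days per test at ~105M; placeholder only
for label, p in [("<=256 MB, no stage 2", 0.0119), ("512 MB, stage 2", 0.0217)]:
    print(f"{label}: expected saving = {p * tests_saved * test_cost:.2f} GHz-days")
# The 512 MB case saves nearly twice the expected effort for the same default.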
Old 2021-07-22, 16:38   #52
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×5×7²×11 Posts

Quote:
Originally Posted by drkirkby View Post
If you want to suggest a particular limit I will repeat for that, but I'm not going to do every combination again, as it would be too time-consuming.
...

What do you mean by 1G - exponents around 10^9? Short of some major breakthrough, we will not get there in my lifetime.
...
I wonder if George would consider increasing the default P-1/P+1/ECM stage 2 memory from 0.3 GB to 0.5 GB, which should benefit anyone who does not change the default RAM setting.
Re 2-tests-saved bounds determination vs ram availability, I suggest powers of 2 of available ram, starting from 256GiB down to ~8GiB. And they don't have to be the identical exponent; adjacent unfactored wavefront exponents that you're going to run anyway, preparatory to queued PRP assignments, and that would use the same fft length, should be good enough - and all contribute toward getting your assignments done. If different exponents make you uneasy, repeat max ram with the last exponent in the series to determine a bound on how much B1 or B2 might have changed with exponent for constant ram allocation. E.g. max ram at the min exponent, declining ram with increasing exponent, min ram at the almost-max exponent, and max ram at the max exponent, in as closely spaced prime exponents as you can readily arrange.
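A sketch of that schedule, using placeholder exponents taken from drkirkby's list later in the thread:
Code:
# Halve RAM while stepping through adjacent wavefront exponents, then repeat
# max RAM on the last exponent as a control on how B1/B2 vary with exponent.
rams_gib = [256, 128, 64, 32, 16, 8]
exponents = [105196813, 105204559, 105211111,
             105211597, 105212323, 105212539]   # placeholders

runs = list(zip(rams_gib, exponents)) + [(256, exponents[-1])]
for ram, p in runs:
    print(f"P-1 on M{p}: allow {ram} GiB for stage 2")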

Re 1G: yes, ~10^9 exponent, but not greater. (There is nowhere to reserve or report P-1 or PRP for p > 10^9.) These P-1 to GPU72-row bounds can be performed by mprime with AVX512 in several days, or by GpuOwl on a Radeon VII in a few days. They'd probably take ~2 weeks each on your 13-core/worker configuration of the 8167M, or a few days as 1 worker; ~a week on a KNL Xeon Phi with 16GB MCDRAM. Mprime on FMA3 can handle up to ~920M.
PRP tests for ~1G are of order 5.5 months on Radeon VII, a year on KNL Xeon Phi. (Future Mlucas releases will support higher.)

You're correct that reaching that 1G level with the wavefront is beyond our lifetimes. I estimated it to be beyond the lifetime of grandchildren of the children born this year, at our current rate of progress and projected life expectancies.
That does not stop us all from sampling a select very few large exponents though, for software QA or special subprojects (or simultaneously performing both with the same sampling).
That's how I happened to run a PRP proof on a large enough exponent for George to identify the PrimeNet server issue with processing proof files for exponents over ~596M (the SSE2 fft size limit).
And an unverified LL test was run years ago on 999999937.

I'd like to see the default at 1GiB. As installed ram ramps up over time, so do software demands, and it probably should be reconsidered periodically. George is understandably cautious with defaults to avoid creating problems with low-ram systems of new users that might give up rather than reconfigure. We know from experience that some users don't read or apply the documentation if they can avoid it. I can run wavefront P-1 on 2GiB 32-bit laptops, but it takes care and patience and they're doing little else except heat the house in the winter.

Old 2021-07-22, 19:18   #53
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

150E₁₆ Posts

Ok, I tested prime95 v30.6b4 on a dual-Xeon E5-2670 Win7 system with 128GiB installed, 110GiB allowed for stage 2, and no per-worker limits, running manually issued 2-tests-saved wavefront P-1 in parallel on all 4 workers, and caught some effects of the 1/n memory split for n = 1, 2, 3 & 4 during the performance of a single stage 2.
Code:
[Jul 22 02:46] Optimal P-1 factoring of M105212827 using up to 112640MB of memory.
[Jul 22 02:46] Assuming no factors below 2^76 and 2 primality tests saved if a factor is found.
[Jul 22 02:46] Optimal bounds are B1=882000, B2=50259000
[Jul 22 02:46] Chance of finding a factor is an estimated 4.61%
[Jul 22 02:46] 
[Jul 22 02:46] Setting affinity to run helper thread 2 on CPU core #7
[Jul 22 02:46] Setting affinity to run helper thread 1 on CPU core #6
[Jul 22 02:46] Using AVX FFT length 5600K, Pass1=896, Pass2=6400, clm=1, 4 threads
[Jul 22 02:46] Setting affinity to run helper thread 3 on CPU core #8
[Jul 22 02:49] M105212827 stage 1 is 0.78% complete. Time: 145.126 sec.
[Jul 22 02:51] M105212827 stage 1 is 1.56% complete. Time: 148.352 sec.
...
[Jul 22 08:01] M105212827 stage 1 is 98.98% complete. Time: 160.584 sec.
[Jul 22 08:03] M105212827 stage 1 is 99.77% complete. Time: 162.177 sec.
[Jul 22 08:04] M105212827 stage 1 complete. 2545832 transforms. Time: 19074.068 sec.
[Jul 22 08:04] Starting stage 1 GCD - please be patient.
[Jul 22 08:05] Stage 1 GCD complete. Time: 50.201 sec.
[Jul 22 08:05] Available memory is 56267MB.
[Jul 22 08:05] D: 630, relative primes: 1261, stage 2 primes: 2945707, pair%=92.87
[Jul 22 08:05] Using 56229MB of memory.
[Jul 22 08:08] Stage 2 init complete. 12553 transforms. Time: 179.032 sec.
[Jul 22 08:12] M105212827 stage 2 is 0.56% complete. Time: 237.876 sec.
[Jul 22 08:13] Restarting worker with new memory settings.
[Jul 22 08:13] Optimal P-1 factoring of M105212827 using up to 112640MB of memory.
[Jul 22 08:13] Assuming no factors below 2^76 and 2 primality tests saved if a factor is found.
[Jul 22 08:13] Optimal bounds are B1=882000, B2=50259000
[Jul 22 08:13] Chance of finding a factor is an estimated 4.61%
[Jul 22 08:13] 
[Jul 22 08:13] Setting affinity to run helper thread 1 on CPU core #6
[Jul 22 08:13] Setting affinity to run helper thread 2 on CPU core #7
[Jul 22 08:13] Using AVX FFT length 5600K, Pass1=896, Pass2=6400, clm=1, 4 threads
[Jul 22 08:13] Setting affinity to run helper thread 3 on CPU core #8
[Jul 22 08:13] Available memory is 37545MB.
[Jul 22 08:13] D: 462, relative primes: 840, stage 2 primes: 2923435, pair%=88.77
[Jul 22 08:13] Using 37520MB of memory.
[Jul 22 08:15] Stage 2 init complete. 7614 transforms. Time: 119.963 sec.
[Jul 22 08:15] M105212827 stage 2 is 12.00% complete.
[Jul 22 08:20] M105212827 stage 2 is 12.52% complete. Time: 253.263 sec.
[Jul 22 08:24] M105212827 stage 2 is 13.05% complete. Time: 242.865 sec.
[Jul 22 08:28] M105212827 stage 2 is 13.58% complete. Time: 240.623 sec.
...
[Jul 22 09:48] M105212827 stage 2 is 24.20% complete. Time: 245.913 sec.
[Jul 22 09:52] M105212827 stage 2 is 24.72% complete. Time: 240.834 sec.
[Jul 22 09:55] Restarting worker with new memory settings.
[Jul 22 09:55] Optimal P-1 factoring of M105212827 using up to 112640MB of memory.
[Jul 22 09:55] Assuming no factors below 2^76 and 2 primality tests saved if a factor is found.
[Jul 22 09:55] Optimal bounds are B1=882000, B2=50259000
[Jul 22 09:55] Chance of finding a factor is an estimated 4.61%
[Jul 22 09:55] 
[Jul 22 09:55] Setting affinity to run helper thread 1 on CPU core #6
[Jul 22 09:55] Setting affinity to run helper thread 2 on CPU core #7
[Jul 22 09:55] Using AVX FFT length 5600K, Pass1=896, Pass2=6400, clm=1, 4 threads
[Jul 22 09:55] Setting affinity to run helper thread 3 on CPU core #8
[Jul 22 09:55] Available memory is 28177MB.
[Jul 22 09:55] D: 420, relative primes: 629, stage 2 primes: 2231401, pair%=90.76
[Jul 22 09:55] Using 28142MB of memory.
[Jul 22 09:57] Stage 2 init complete. 6415 transforms. Time: 88.149 sec.
[Jul 22 09:57] M105212827 stage 2 is 25.15% complete.
[Jul 22 10:00] M105212827 stage 2 is 25.69% complete. Time: 229.273 sec.
[Jul 22 10:04] M105212827 stage 2 is 26.25% complete. Time: 242.008 sec.

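A rough cross-check of those three allocations, on my reading (not documented prime95 internals) that stage 2 memory is dominated by one FFT-length buffer of 8-byte doubles per relative prime:
Code:
# One stage-2 buffer = one FFT-length array of 8-byte doubles (assumption).
buf_mib = 5600 * 1024 * 8 / 2**20   # ~43.75 MiB at FFT length 5600K
for used_mb, relprimes in [(56229, 1261), (37520, 840), (28142, 629)]:
    print(f"{used_mb} MB / {relprimes} relative primes = "
          f"{used_mb / relprimes:.1f} MB each (vs {buf_mib:.2f} MiB per buffer)")
# All three come out near 44-45 MB per relative prime, consistent with the log.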
Old 2021-07-22, 23:32   #54
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

18E₁₆ Posts

Quote:
Originally Posted by kriesel View Post
Re 2-tests-saved bounds determination vs ram availability, I suggest powers of 2 of available ram, starting from 256GiB down to ~8GiB. And they don't have to be the identical exponent; adjacent unfactored wavefront exponents that you're going to run anyway, preparatory to queued PRP assignments, and that would use the same fft length, should be good enough - and all contribute toward getting your assignments done. If different exponents make you uneasy, repeat max ram with the last exponent in the series to determine a bound on how much B1 or B2 might have changed with exponent for constant ram allocation. E.g. max ram at the min exponent, declining ram with increasing exponent, min ram at the almost-max exponent, and max ram at the max exponent, in as closely spaced prime exponents as you can readily arrange.
My choice of exponent for those tests was a bit unfortunate, as I had several others closer in value. I now have these to test - some of which will be completed quite soon. I'll sort out the closest ones, and see what I can benchmark.

104571359
104758957
104809069
104874347
104881739
104895641
105196813
105204559
105211111
105211597
105212323
105212539
105212563
105213601
105216541
105221359

If necessary I will perform testing with different amounts of RAM on the same exponent, even if it wastes some CPU cycles. Otherwise it will always bug me that the results might be skewed by not using the same, or a very close, exponent.

I had a fair bit to drink last night (it's just gone midnight here in the UK). So I am not really in a great position to analyse much. But for M105216541, with RAM unconstrained, I got the following results. All times are local times, so one hour ahead of GMT/UTC.
  1. Based on saving 1 primality test. B1=434000, B2=21339000. Chance of finding a factor is an estimated 3.60%. Started 15:15. Finished 16:57. Runtime = 1 hour, 42 minutes = 102 minutes. Used 207872 MB (203 GB) RAM. 9.0812 GHz days credit.
  2. Based on saving 2 primality tests. B1=889000, B2=52784000. Chance of finding a factor is an estimated 4.66%. Started 17:01. Finished 22:16. Runtime = 5 hours, 15 minutes = 315 minutes. Used 311330 MB (304 GB) RAM. 21.4559 GHz days credit.
The ratio of runtimes (315/102 = 3.08824:1) is a lot more than the ratio of GHz days credits (21.4559/9.0812 = 2.36267). I'm not bothered by the amount of GHz days credit I get, but these differences may well change the optimal value to use for tests_saved.
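Putting that gap in per-hour terms (simple arithmetic on the two runs above):
Code:
# Credit earned per wall-clock hour for the two runs listed above.
runs = {"tests_saved=1": (9.0812, 102), "tests_saved=2": (21.4559, 315)}
for label, (ghz_days, minutes) in runs.items():
    print(f"{label}: {ghz_days / (minutes / 60):.2f} GHz-days per hour")
# ~5.34 vs ~4.09: the tests_saved=2 run earns about 23% less credit per hour.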

Quote:
Originally Posted by kriesel View Post
Re 1G: yes, ~10^9 exponent, but not greater. (There is nowhere to reserve or report P-1 or PRP for p > 10^9.)

You're correct that reaching that 1G level with the wavefront is beyond our lifetimes. I estimated it to be beyond the lifetime of grandchildren of the children born this year, at our current rate of progress and projected life expectancies.
That does not stop us all from sampling a select very few large exponents though, for software QA or special subprojects (or simultaneously performing both with the same sampling).
I appreciate that, but I am somewhat surprised at the huge number of GHz days spent trial-factoring exponents much larger than the current wavefront.
Quote:
Originally Posted by kriesel View Post
I'd like to see the default at 1GiB. As installed ram ramps up over time, so do software demands, and it probably should be reconsidered periodically. George is understandably cautious with defaults to avoid creating problems with low-ram systems of new users that might give up rather than reconfigure. We know from experience that some users don't read or apply the documentation if they can avoid it. I can run wavefront P-1 on 2GiB 32-bit laptops, but it takes care and patience and they're doing little else except heat the house in the winter.
Yes, 1 GB does seem sensible - perhaps reduced if the computer has little RAM.