mersenneforum.org  

Old 2021-11-26, 08:12   #34
lisanderke
 
"Lisander Viaene"
Oct 2020
Belgium

109 Posts

I let it run overnight (I've only just woken up, so I'll stop using 30.8) and I seem to have run into an infinite stage 2 init. Init started at 5:30 am; I tried to stop the worker at 9 am local time, and the init still hasn't completed, so the worker won't stop. It seems the FFT length changed from 448K during stage 1 to 512K at the start of stage 2.
Attached Thumbnails: Screenshot 2021-11-26 090744.png (82.1 KB), Screenshot 2021-11-26 091148.png (33.2 KB)

Last fiddled with by lisanderke on 2021-11-26 at 08:13
Old 2021-11-26, 08:13   #35
lycorn
 
"GIMFS"
Sep 2002
Oeiras, Portugal

7×13×17 Posts

My own experience:

Started 30.8 around 23:30 last night, on an exponent in the 12M range. I ran only one worker with 4 threads; the machine is an i5-7400 @ 3.2 GHz (Kaby Lake) with 16 GB of RAM, and I allowed Prime95 to use 13.5 GB. Bounds were B1=1100000 and B2=110000000. Stage 1 went well, taking just under 20 minutes; at around 75% I stopped and resumed the run without any problem. Then Stage 2 started and gave out the usual message about the init phase being complete. I left the PC running and went to bed.

This morning, 8 hours later, I found the PC thrashing so hard that at first it seemed to have crashed (which turned out not to be the case). There were no new messages on the screen, the last one being the Stage 2 init, which had taken around 20 seconds. The PC wasn't totally unusable, but the thrashing was so bad that it took more than 30 seconds just to get to the Task Manager using Ctrl-Alt-Del. Once there, I found Prime95 was using a bit more than 12 GB of memory, which should not have been a problem: that figure is in line with many other Prime95 runs I've done without any issues. The CPU usage was near zero. Now the funny thing is that when I stopped and then exited P95 from its own window, nothing changed in the Task Manager window; in particular the reported memory usage was still the same (~12 GB), and so it remained until I finally killed the process from Task Manager.
There seemed to be some memory management issue, as P95 was "refusing" to release the memory even though its use of the CPU had already gone down to zero.
Then I tried to restart the run; it restarted from a point somewhere in Stage 1, although that stage had properly finished and Stage 2 had (apparently) completed its init phase. I was expecting a Stage 2 save file to be present, but that didn't seem to be the case: the PC had just been busy doing nothing. The restart point was with Stage 1 ~75% complete, which is where I had interrupted and then continued the run.

Just read lisanderke's post: pretty much the same experience.

Last fiddled with by lycorn on 2021-11-26 at 08:18
Old 2021-11-26, 09:40   #36
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

3·5·17·19 Posts

Once the stable 30.8 version is released, could it be a good idea to run it against F12-F28 on a high memory machine?
Old 2021-11-26, 13:11   #37
axn
 
Jun 2003

2⁸·3·7 Posts

F12-F15 is probably less useful because GMP-ECM would also work. However, probably still worth a shot.

If there is sufficient RAM, it might even work with F29.
Old 2021-11-26, 14:18   #38
axn
 
Jun 2003

2⁸×3×7 Posts

Some scaling results for 30.8b1

CPU: Ryzen R5 3600. Ubuntu 20.04
5 workers running stage 1. 6th worker running Stage 2.
FFT=320K / 384K. No roundoff checking. B1=1.1M. All stage 2 runs were resumed from previously run stage 1 save files.

Code:
Memory = 9.5 GB

B2=100M => D: 4290, 480x2630 B2=100308780 18s+239s
B2=200M => D: 5610, 640x2470 B2=202290990 26s+476s
B2=400M => D: 5610, 640x2470

Memory = 38 GB

B2=400M => D: 21450, 2400x10082 B2=510424200 130s+343s
B2=800M => D: 20790, 2160x10322 B2=810352620 118s+531s
B2=1.6G => D: 20790, 2160x10322


Memory = 57 GB

B2=100M => D: 30030, 2880x15849 B2=308888580 155s+176s
B2=200M => D: 30030, 2880x15849 B2=314774460 155s+177s
B2=400M => D: 30030, 2880x15849 B2=629548920 155s+358s
B2=800M => D: 30030, 2880x15849 B2=956065110 161s+549s
B2=1.6G => D: 30030, 2880x15849 B2=1609127520 156s+887s


Memory = 88 GB (RAM @ 2133 instead of 3600 used for other tests)

B2=100M => D: 43890, 4320x24603 B2=708340710 276s+382s
B2=200M => D: 43890, 4320x24603 B2=716065350 272s+384s
B2=400M => D: 43890, 4320x24603 B2=731426850 272s+380s
B2=800M => D: 43890, 4320x24603 B2=1462853700 271s+756s
B2=1.6G => D: 43890, 4320x24603 B2=2225047440 273s+1139s
An interesting observation: For the 57 GB & 88 GB results, I would have expected it to pick the same B2 for requested B2 = 100M/200M/400M (for a given RAM allocation). Yet there are small differences. Possible bug?

EDIT:- All runs found their respective factors.
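
To make the B2 observation easier to check, here is a minimal Python sketch that parses the 57 GB rows above and prints how far the B2 actually used overshoots the requested one. The names `ROWS_57GB`, `ROW` and `parse_rows` are made up for this illustration, and the two timings on each row are simply summed into one total, since the post does not say what each figure measures.

```python
import re

# Rows copied verbatim from the 57 GB block above.
ROWS_57GB = """\
B2=100M => D: 30030, 2880x15849 B2=308888580 155s+176s
B2=200M => D: 30030, 2880x15849 B2=314774460 155s+177s
B2=400M => D: 30030, 2880x15849 B2=629548920 155s+358s
B2=800M => D: 30030, 2880x15849 B2=956065110 161s+549s
B2=1.6G => D: 30030, 2880x15849 B2=1609127520 156s+887s
"""

# requested B2 (with M/G suffix), D, polynomial sizes, actual B2, two timings
ROW = re.compile(
    r"B2=(?P<req>[\d.]+)(?P<unit>[MG]) => D: (?P<d>\d+), \d+x\d+ "
    r"B2=(?P<actual>\d+) (?P<t1>\d+)s\+(?P<t2>\d+)s"
)
UNITS = {"M": 10**6, "G": 10**9}


def parse_rows(text):
    """Return one dict per row that reports an actual B2 and timings."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue  # rows without a reported result are skipped
        rows.append({
            "requested": round(float(m["req"]) * UNITS[m["unit"]]),
            "actual": int(m["actual"]),
            "total_s": int(m["t1"]) + int(m["t2"]),
        })
    return rows


rows = parse_rows(ROWS_57GB)
for r in rows:
    ratio = r["actual"] / r["requested"]
    print(f'{r["requested"]:>13,} -> {r["actual"]:>13,}  x{ratio:.2f}  {r["total_s"]}s')
```

Pasting the 88 GB block into the same string gives the equivalent comparison for that run.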

Last fiddled with by axn on 2021-11-26 at 14:19
Old 2021-11-28, 20:32   #39
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

17333₈ Posts

Let's try again. It turns out I did not fully understand (and still don't) the roundoff behavior of the new polymult code. Couple that with several issues recovering from excessive roundoff error and bad things happened in build 1. Believe it or not, the roundoff problems boiled down to difficulty calculating one times one.

This build may not fix all previously reported issues, but let's see how it does. This version will print out more roundoff error info which you can safely ignore.

Should you wish to try 30.8, same warnings as before. Links are below.
  • Use this version only for P-1 work on Mersenne numbers. This really is pre-beta!
  • Please rerun your last 3 or 4 successful P-1 runs to QA that the new P-1 stage 2 code finds those factors.
  • Use much more aggressive B2 bounds. While the optimal B2 calculations may not be perfect I recommend using them anyway.
  • Turn on roundoff error checking.
  • Give stage 2 as much memory as you can. Only run one worker with high memory. The default value for MaxHighMemWorkers is now one.
  • Save files during P-1 stage 2 cannot be created.
  • There is no progress reporting during P-1 stage 2.
  • P-1 stage 2 is untested on 100M+ exponents. I am not sure the code can accurately gauge when the new code is faster than the old code.
  • AVX-512 is untested -- likely to fail (perhaps silently). Pre-AVX is untested but might work. Recommend using only AVX and FMA FFTs.
  • MaxStage0Prime in undoc.txt has changed.
  • Archive your completed P-1 save files in case there are bugs found that require re-running stage 2.

Windows 64-bit: https://mersenne.org/ftp_root/gimps/p95v308b2.win64.zip
Linux 64-bit: https://mersenne.org/ftp_root/gimps/...linux64.tar.gz
Old 2021-11-28, 20:39   #40
techn1ciaN
 
Oct 2021
U. S. / Maine

10010010₂ Posts

Quote:
Originally Posted by Prime95
Should you wish to try 30.8 ...
I'm using 30.7 for wavefront P-1 with a "normal" amount of RAM (13 GB allocated). Am I correct in understanding that 30.8's throughput boost for this work will not be large enough to justify upgrading before there is a more stable build available?
Old 2021-11-28, 20:58   #41
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

24452₈ Posts

Quote:
Originally Posted by Prime95
Believe it or not, the roundoff problems boiled down to difficulty calculating one times one.
Old 2021-11-28, 21:36   #42
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7899₁₀ Posts

Quote:
Originally Posted by techn1ciaN
I'm using 30.7 for wavefront P-1 with a "normal" amount of RAM (13 GB allocated). Am I correct in understanding that 30.8's throughput boost for this work will not be large enough to justify upgrading before there is a more stable build available?
You are correct
Old 2021-11-28, 21:38   #43
kruoli
 
"Oliver"
Sep 2017
Porta Westfalica, DE

2005₈ Posts

My test case was two workers. The first had a known factor. The second had some other work:
Code:
[Worker #1]
Pminus1=N/A,1,2,22463209,-1,1000000,324000000,75
[Worker #2]
Pminus1=N/A,1,2,21362113,-1,1000000,32400000,75
Pminus1=N/A,1,2,21362903,-1,1000000,32400000,75
It started normally, but did not state which B2 it wanted to use. I had a stage 1 file, which it used successfully. While stage 2 in worker #1 was running (using 110-115% of the memory I had allowed it), stage 1 of the first assignment in worker #2 completed and the second assignment was started. After the factor was found, the worktodo entry in worker #1 was removed. It then crashed with error code 0xc0000005 at 0x000000000208b09a.

I tried to start the program again. When worker #2 started (it now tried to start stage 2 of its first assignment), it gave a B2 value this time, but crashed again. So I ran it in the debugger and got an error at 0x00007FF7093CB09A in prime95.exe: 0xC0000005: access violation exception reading 0xFFFFFFFFFFFFFFE4.
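
For anyone who wants to sanity-check worktodo entries like the ones above before a run, here is a small Python sketch that splits a `Pminus1=` line into named fields. The field order assumed here (assignment ID, then k, b, n, c for k·b^n+c, then B1, B2, then the trial-factoring depth in bits) is my reading of these entries and should be checked against the Prime95 documentation; `Pminus1Entry` and `parse_pminus1` are hypothetical names, not part of Prime95.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pminus1Entry:
    assignment_id: str       # "N/A" in the entries above
    k: int                   # number under test is assumed to be k*b^n+c
    b: int
    n: int
    c: int
    b1: int
    b2: int
    tf_bits: Optional[int]   # assumed: trial-factoring depth, if present


def parse_pminus1(line: str) -> Pminus1Entry:
    """Parse one 'Pminus1=...' line under the assumed field order."""
    key, sep, rest = line.strip().partition("=")
    if key != "Pminus1" or not sep:
        raise ValueError(f"not a Pminus1 entry: {line!r}")
    fields = rest.split(",")
    if len(fields) < 7:
        raise ValueError(f"too few fields: {line!r}")
    k, b, n, c, b1, b2 = (int(f) for f in fields[1:7])
    tf_bits = int(fields[7]) if len(fields) > 7 else None
    return Pminus1Entry(fields[0], k, b, n, c, b1, b2, tf_bits)


entries = [
    parse_pminus1(line)
    for line in (
        "Pminus1=N/A,1,2,22463209,-1,1000000,324000000,75",
        "Pminus1=N/A,1,2,21362113,-1,1000000,32400000,75",
        "Pminus1=N/A,1,2,21362903,-1,1000000,32400000,75",
    )
]
for e in entries:
    # Print the exponent and the B2/B1 ratio so mismatched bounds stand out.
    print(e.n, e.b2 // e.b1)
```

Printing the B2/B1 ratio per entry makes it easy to spot bounds that differ between workers.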

Last fiddled with by kruoli on 2021-11-28 at 22:04 Reason: Worktodo was incorrect.
Old 2021-11-28, 22:54   #44
nordi
 
Dec 2016

170₈ Posts

I tested 30.8b2 with 11909879, 11936063, 11933137, 11977759, and 11968721 which all produced the expected factors. The exponents used FMA3 FFT length 768K for stage 2 with 4 worker threads.

The automatically chosen B2 was too aggressive (and it changed its mind from 1735*B1 to 1497*B1, which I have not seen before):
Code:
[Work thread Nov 28 23:08] M11977759 stage 1 complete. 1154648 transforms. Total time: 424.667 sec.
...
[Work thread Nov 28 23:08] With trial factoring done to 2^67, optimal B2 is 1735*B1 = 694000000.
[Work thread Nov 28 23:08] If no prior P-1, chance of a new factor is 10.3%
[Work thread Nov 28 23:08] Switching to FMA3 FFT length 768K, Pass1=768, Pass2=1K, clm=1, 4 threads
...
[Work thread Nov 28 23:08] With trial factoring done to 2^67, optimal B2 is 1497*B1 = 598800000.
[Work thread Nov 28 23:08] If no prior P-1, chance of a new factor is 10.1%
[Work thread Nov 28 23:08] Using 27715MB of memory.  D: 8190, 864x3627 polynomial multiplication.
...
[Work thread Nov 28 23:44] M11977759 stage 2 complete. 774575 transforms. Total time: 2038.543 sec.
So stage 2 took nearly five times as long as stage 1!
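
The numbers in that log can be cross-checked with a few lines of Python. This is only a sketch: `LOG`, `implied_b1` and `stage_times` are names invented for the example, and the log text is limited to the lines quoted above.

```python
import re

# Relevant lines copied from the log above.
LOG = """\
[Work thread Nov 28 23:08] M11977759 stage 1 complete. 1154648 transforms. Total time: 424.667 sec.
[Work thread Nov 28 23:08] With trial factoring done to 2^67, optimal B2 is 1735*B1 = 694000000.
[Work thread Nov 28 23:08] With trial factoring done to 2^67, optimal B2 is 1497*B1 = 598800000.
[Work thread Nov 28 23:44] M11977759 stage 2 complete. 774575 transforms. Total time: 2038.543 sec.
"""

# Each "optimal B2" line gives a multiplier and the resulting B2,
# so dividing one by the other recovers the B1 the program used.
implied_b1 = {
    int(b2) // int(mult)
    for mult, b2 in re.findall(r"optimal B2 is (\d+)\*B1 = (\d+)", LOG)
}

# Map stage number -> total time in seconds.
stage_times = {
    int(stage): float(secs)
    for stage, secs in re.findall(
        r"stage (\d) complete\. \d+ transforms\. Total time: ([\d.]+) sec", LOG
    )
}

ratio = stage_times[2] / stage_times[1]
print("implied B1:", implied_b1)
print(f"stage 2 / stage 1 time: {ratio:.2f}")
```

Both "optimal B2" lines imply the same B1, so only the multiplier changed between the two messages.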
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
