mersenneforum.org  

Great Internet Mersenne Prime Search > Hardware > GPU Computing > GpuOwl
2023-01-11, 18:39   #2894
preda
"Mihai Preda"
Apr 2015
5×17² Posts

Quote:
Originally Posted by kracker
Out of the loop, but is stage 2 of P1 still doable by gpuowl or is Prime95 required to do that stage?

The most recent version, which added P-1 export to mprime, no longer does stage 2 at all, because mprime (given some RAM) is much more efficient at stage 2 than gpuowl.

2023-01-14, 00:38   #2895
moebius
Jul 2009
Germany
11×61 Posts

gpuOwl benchmarks online

update
column: estimated runtime in hours

hypothetical values:
Quadro RTX 6000 Ada-AD102
Quadro RTX 5500-Ga102
Quadro RTX 4500-Ga102

Last fiddled with by moebius on 2023-01-14 at 00:38

2023-01-18, 19:01   #2896
Xyzzy
Aug 2002
8557₁₀ Posts

Quote:
Originally Posted by Xyzzy
Okay, let's try this a second time.
We ran the efficiency numbers for our new 6950XT.

The attached table shows the energy efficiency for this card. We have modified the fan curve to run at 1% fan speed per °C. (So if the card is 46 °C the fans run at 46%.) The PRP work unit we tested is in the 114.9M range. "HS" = GPU hot spot. Our "$/kWh" is roughly 10¢.

[Attachment 27933: 6950XT.png, energy-efficiency table for the 6950XT]
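
For anyone who wants to redo the cost arithmetic with their own measurements, here is a minimal sketch. The function names and the sample wattage and timing in the usage lines are hypothetical (the real figures are in the attached table, which is not reproduced here). Only the fan rule and the roughly 10¢/kWh rate come from the post, and the cost estimate assumes about one iteration per bit of the exponent, ignoring check, save and proof overhead.

```python
def fan_percent(temp_c: float) -> float:
    """Fan curve from the post: 1% fan speed per degree C, clamped to 0..100."""
    return max(0.0, min(100.0, float(temp_c)))


def prp_cost_usd(power_watts: float, us_per_iter: float, exponent: int,
                 usd_per_kwh: float = 0.10) -> float:
    """Rough electricity cost of one PRP test.

    Assumes roughly `exponent` iterations at a constant time per iteration
    and a constant power draw.
    """
    seconds = exponent * us_per_iter * 1e-6
    kwh = power_watts * seconds / 3.6e6  # 1 kWh = 3.6e6 joules
    return kwh * usd_per_kwh


if __name__ == "__main__":
    print(fan_percent(46))  # the 46 degree C example from the post
    # Hypothetical 250 W and 900 us/it for an exponent in the 114.9M range.
    print(round(prp_cost_usd(250.0, 900.0, 114_900_000), 2))
```

Plugging in the wattage and us/it from a row of the table should give the cost per test for that power setting.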


2023-01-18, 22:23   #2897
moebius
Jul 2009
Germany
11×61 Posts

Quote:
Originally Posted by Xyzzy
We ran the efficiency numbers for our new 6950XT.

The attached table shows the energy efficiency for this card. We have modified the fan curve to run at 1% fan speed per °C. (So if the card is 46 °C the fans run at 46%.) The PRP work unit we tested is in the 114.9M range. "HS" = GPU hot spot. Our "$/kWh" is roughly 10¢.

Attachment 27933

I need a benchmark for -prp 77936867.

2023-01-19, 01:22   #2898
Xyzzy
Aug 2002
43·199 Posts

Quote:
Originally Posted by moebius
I need a benchmark for -prp 77936867.

77936867:
Code:
20230118 19:11:15 gfx1030-0 77936867 OK       800   0.00% 1579c241dc63eca6  580 us/it + check 0.26s + save 0.09s; ETA 12:33
20230118 19:11:20 gfx1030-0 77936867     10000 fc4f135f7cf4ad29  584
20230118 19:11:26 gfx1030-0 77936867     20000 3cd1bd9d5e09cbc5  586
20230118 19:11:32 gfx1030-0 77936867     30000 c4e0ff35e3290d98  588
20230118 19:11:38 gfx1030-0 77936867     40000 dffe1b1b0d748128  589
20230118 19:11:43 gfx1030-0 77936867     50000 52e286945371ed29  590
20230118 19:11:49 gfx1030-0 77936867     60000 0945da4dc08bdd95  590
20230118 19:11:55 gfx1030-0 77936867     70000 7131fa4eb77f4bb2  591
20230118 19:12:01 gfx1030-0 77936867     80000 8d76071d27ee4221  591
20230118 19:12:07 gfx1030-0 77936867     90000 0bacff453b2f470e  593
20230118 19:12:13 gfx1030-0 77936867 Stopping, please wait..
20230118 19:12:13 gfx1030-0 77936867 OK    100000   0.13% 6d7296b9e2830f50  594 us/it + check 0.26s + save 0.09s; ETA 12:50
116085643:
Code:
20230118 19:16:47 gfx1030-0 116085643 OK       800   0.00% 21ada0de9c2ec1c2  880 us/it + check 0.39s + save 0.13s; ETA 1d 04:22
20230118 19:16:55 gfx1030-0 116085643     10000 8da58707e2590f7d  883
20230118 19:17:04 gfx1030-0 116085643     20000 4150234a796ca594  887
20230118 19:17:13 gfx1030-0 116085643     30000 c6d23cd4b6e53a5e  889
20230118 19:17:22 gfx1030-0 116085643     40000 dec36e69af1d4286  890
20230118 19:17:31 gfx1030-0 116085643     50000 86ac9aa769a6e695  894
20230118 19:17:40 gfx1030-0 116085643     60000 b2b4143077687086  896
20230118 19:17:49 gfx1030-0 116085643     70000 9b5413f07524a278  897
20230118 19:17:58 gfx1030-0 116085643     80000 9d19ac27034804f6  899
20230118 19:18:07 gfx1030-0 116085643     90000 3af85c603ca2e3d9  899
20230118 19:18:16 gfx1030-0 116085643 Stopping, please wait..
20230118 19:18:16 gfx1030-0 116085643 OK    100000   0.09% 13383f413d3c8532  902 us/it + check 0.39s + save 0.13s; ETA 1d 05:04
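
A rough sanity check on the ETA column, as a minimal sketch: it assumes the ETA is simply the remaining iterations times the displayed us/it, which is my guess and not something taken from the gpuowl source. The regular expression and function names are my own. Under that assumption the sketch reproduces three of the four ETAs above (12:33, 12:50 and 1d 04:22). For the last line it gives 1d 05:03 instead of 1d 05:04, presumably because the displayed us/it is rounded.

```python
import re

# Matches the "OK" status lines in the logs above (layout inferred from those lines).
OK_LINE = re.compile(
    r"^\d{8} \d\d:\d\d:\d\d \S+ (?P<exponent>\d+) OK\s+(?P<iteration>\d+)"
    r"\s+[\d.]+%\s+[0-9a-f]{16}\s+(?P<us_per_it>\d+) us/it"
)


def parse_ok_line(line: str):
    """Return (exponent, iteration, us_per_it) or None if the line does not match."""
    m = OK_LINE.match(line)
    if m is None:
        return None
    return int(m["exponent"]), int(m["iteration"]), int(m["us_per_it"])


def eta_seconds(exponent: int, iteration: int, us_per_it: float) -> float:
    """Remaining iterations times time per iteration, in seconds."""
    return (exponent - iteration) * us_per_it / 1e6


def format_eta(seconds: float) -> str:
    """Format as [Nd ]HH:MM, truncating to whole minutes."""
    minutes = int(seconds) // 60
    days, minutes = divmod(minutes, 1440)
    hours, minutes = divmod(minutes, 60)
    prefix = f"{days}d " if days else ""
    return f"{prefix}{hours:02d}:{minutes:02d}"


if __name__ == "__main__":
    line = ("20230118 19:11:15 gfx1030-0 77936867 OK       800   0.00% "
            "1579c241dc63eca6  580 us/it + check 0.26s + save 0.09s; ETA 12:33")
    exponent, iteration, us = parse_ok_line(line)
    print(format_eta(eta_seconds(exponent, iteration, us)))
```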

2023-01-19, 10:15   #2899
moebius
Jul 2009
Germany
671₁₀ Posts

Asus Vega 64 with gpuOwl v6.11-364 and v7.2-129, plus the additional exponent:
https://mersenneforum.org/showpost.p...3&postcount=56

Last fiddled with by moebius on 2023-01-19 at 10:16

2023-01-29, 16:01   #2900
Stef42
Feb 2012
the Netherlands
44₁₆ Posts

Gave v7.2-129 a go on some low-exponent P-1 work, with a Radeon VII on Windows, using prime95 v30.8 b17 for stage 2.
Gpuowl would do stage 1, while prime95 would do stage 2.

What I noticed:
  1. A worktodo.add and an m<exponent> file are copied to the prime95 folder, but prime95 does not notice them.
  2. When restarted, prime95 sees that P1 is done but ignores it. It simply runs P1+P2 with its own chosen bounds.
  3. As mentioned in this thread, each time prime95 is done with its work it switches NoMoreWork and UsePrimenet (bug?).
  4. Resetting these from 0 to 1 makes prime95 think I "quit GIMPS", but it no longer creates AIDs for manual assignments. Weird.

Do I need to use a particular prime95 version or something?
The guides on p262 were most helpful and easy to follow, but the two programs don't seem to play nicely together.

2023-01-29, 17:10   #2901
preda
"Mihai Preda"
Apr 2015
2645₈ Posts

Quote:
Originally Posted by Stef42
  1. A worktodo.add and an m<exponent> file are copied to the prime95 folder, but prime95 does not notice them.
  2. When restarted, prime95 sees that P1 is done but ignores it. It simply runs P1+P2 with its own chosen bounds.
  3. As mentioned in this thread, each time prime95 is done with its work it switches NoMoreWork and UsePrimenet (bug?).
  4. Resetting these from 0 to 1 makes prime95 think I "quit GIMPS", but it no longer creates AIDs for manual assignments. Weird.

Could you post the log from prime95 when it "sees P1 is done, but goes P1+P2 with its own chosen bounds"? That log may shed light on what's happening at that point.

It may take some time until prime95 notices the new worktodo.add in its folder. I don't know exactly how often it checks; let's say on the order of 30 minutes. But stopping and restarting prime95 is a sure way to make it integrate the worktodo.add, as you say.

(I assume you run on Windows, and I don't know exactly how to get the prime95 log on Win, but on Linux I use "mprime -d" which outputs detailed log)

2023-01-29, 18:49   #2902
Stef42
Feb 2012
the Netherlands
2²·17 Posts

Quote:
Originally Posted by preda
Could you post the log from prime95 when it "sees P1 is done, but goes P1+P2 with its own chosen bounds"? That log may shed light on what's happening at that point.

It may take some time until prime95 notices the new worktodo.add in its folder. I don't know exactly how often it checks; let's say on the order of 30 minutes. But stopping and restarting prime95 is a sure way to make it integrate the worktodo.add, as you say.

(I assume you run on Windows, and I don't know exactly how to get the prime95 log on Win, but on Linux I use "mprime -d" which outputs detailed log)
For this example I used this worktodo.txt for gpuowl-win.exe to start:

Code:
Pfactor=1,2,4869703,-1,65,10
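
For readers unfamiliar with the entry, here is a minimal sketch that splits the Pfactor line into named fields. It reads the fields as k, b, n, c (for the number k·b^n+c), followed by the bit level already trial factored and the number of tests saved. That field order is my reading and not taken from the post, although it agrees with the prime95 log further down ("Assuming no factors below 2^65 and 10 primality tests saved"). The class and function names are hypothetical, and the optional leading assignment ID is also an assumption.

```python
from typing import NamedTuple


class PfactorEntry(NamedTuple):
    k: int
    b: int
    n: int
    c: int
    bits_factored: float  # "no factors below 2^bits_factored"
    tests_saved: float    # primality tests saved if a factor is found


def parse_pfactor(line: str) -> PfactorEntry:
    """Parse 'Pfactor=[AID,]k,b,n,c,bits,tests_saved' (assumed layout)."""
    key, sep, value = line.strip().partition("=")
    if key != "Pfactor" or not sep:
        raise ValueError(f"not a Pfactor entry: {line!r}")
    fields = value.split(",")
    if len(fields) == 7:  # assumed optional assignment ID in front
        fields = fields[1:]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    k, b, n, c = (int(f) for f in fields[:4])
    return PfactorEntry(k, b, n, c, float(fields[4]), float(fields[5]))


if __name__ == "__main__":
    entry = parse_pfactor("Pfactor=1,2,4869703,-1,65,10")
    print(f"{entry.k}*{entry.b}^{entry.n}{entry.c:+d}, "
          f"factored to 2^{entry.bits_factored:g}, "
          f"{entry.tests_saved:g} tests saved")
```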
My flags used in the start.bat file (used a computer with NVIDIA instead of AMD gpu for a sec):

Code:
-user Stef42 -cpu GTX1080Ti -proof 9 -yield -autoverify 8 -noclean -d 0 -maxAlloc 8G -mprimeDir C:\Users\Stef42\Downloads\prime95
I let gpuowl select B1, which it set to 2000000, and let it finish. I then saw that worktodo.add and the mxxx file had been moved to the Prime95 folder, and restarted prime95.

The part I believe is relevant is the "Resuming P-1 in stage 2 with B2 from 2000000 to 456527000" line in the log below. Am I seeing things, or does prime95 interpret the file as B2 instead of B1 already done?

Prime95 part:

Code:
[Jan 29 19:11] Worker starting
[Jan 29 19:11] Setting affinity to run worker on CPU core #1
[Jan 29 19:11] Optimal P-1 factoring of M4869703 using up to 10240MB of memory.
[Jan 29 19:11] Assuming no factors below 2^65 and 10 primality tests saved if a factor is found.
[Jan 29 19:11] Optimal bounds are B1=392000, B2=456527000
[Jan 29 19:11] Chance of finding a factor is an estimated 10.5%
[Jan 29 19:11]
[Jan 29 19:11] Using FMA3 FFT length 256K, Pass1=1K, Pass2=256, clm 2, 4 threads
[Jan 29 19:11] Setting affinity to run helper thread 1 on CPU core #2
[Jan 29 19:11] Setting affinity to run helper thread 2 on CPU core #3
[Jan 29 19:11] Setting affinity to run helper thread 3 on CPU core #4
[Jan 29 19:11] Resuming P-1 in stage 2 with B2 from 2000000 to 456527000
[Jan 29 19:11] Inversion of stage 1 result complete, 5 transforms, 1 modular inverse. Time: 0.786 sec.
[Jan 29 19:11] Available memory is 10240MB.
[Jan 29 19:11] Setting affinity to run helper thread 1 on CPU core #2
[Jan 29 19:11] Setting affinity to run helper thread 2 on CPU core #3
[Jan 29 19:11] Setting affinity to run helper thread 3 on CPU core #4
[Jan 29 19:11] Swithcing to FMA3 FFT length 280K, Pass1=896, Pass2=320, clm=2, 4 threads
[Jan 29 19:11] Estimated stage 2 vs. stage 1 runtime ration: 0.222
[Jan 29 19:11] Using 10239MB of memory. D: 8778m 1080x3730 polynomial multiplication.
[Jan 29 19:11] Setting affinity to run polymult helper thread 1 on CPU core #2
[Jan 29 19:11] Setting affinity to run polymult helper thread 2 on CPU core #3
[Jan 29 19:11] Setting affinity to run polymult helper thread 3 on CPU core #4
[Jan 29 19:11] Stage 2 init complete.34045 transforms. Time: 19.331 sec.
[Jan 29 19:13] M4869703 stage 2 complete. 347755 transforms. Total time: 99.975 sec.
[Jan 29 19:13] Starting stage 2 GCD - please be patient.
[Jan 29 19:13] Stage 2 GCD complete. Time: 0.514 sec.
[Jan 29 19:13] M4869703 completed P-1, B1=392000, B2=465119886, Wi4: xxx

Last fiddled with by Stef42 on 2023-01-29 at 18:50

2023-01-29, 19:08   #2903
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·3·1,229 Posts

Quote:
Originally Posted by Stef42
[Jan 29 19:11] Resuming P-1 in stage 2 with B2 from 2000000 to 456527000
[Jan 29 19:11] Inversion of stage 1 result complete, 5 transforms, 1 modular inverse. Time: 0.786 sec.
[Jan 29 19:11] Available memory is 10240MB.
[Jan 29 19:11] Setting affinity to run helper thread 1 on CPU core #2
[Jan 29 19:11] Setting affinity to run helper thread 2 on CPU core #3
[Jan 29 19:11] Setting affinity to run helper thread 3 on CPU core #4
[Jan 29 19:11] Swithcing to FMA3 FFT length 280K, Pass1=896, Pass2=320, clm=2, 4 threads
[Jan 29 19:11] Estimated stage 2 vs. stage 1 runtime ration: 0.222
...
[Jan 29 19:13] M4869703 completed P-1, B1=392000, B2=465119886, Wi4: xxx
Wow that is a quick run on prime95. Some experimentation would be quick.
I don't think the B2 from 2000000 to 456527000 is a problem, but a sign of it running the intended stage 2. Stage 2 needs to cover the primes from B1 to B2. (You could reduce iterations between screen outputs to get some progress reports along the way to confirm what it's starting at.)
"Swithcing" fails spell check.
Reporting a smaller B1 is a problem, especially if it would also do that in results to report. Check whether the results.json.txt also has the too-low 392000 value.
(Can't tell from here because 4869703 does not show a comparable P-1 result.)

Last fiddled with by kriesel on 2023-01-29 at 19:10

2023-01-29, 19:13   #2904
Stef42
Feb 2012
the Netherlands
2²×17 Posts

Quote:
Originally Posted by kriesel
Wow that is a quick run on prime95. Some experimentation would be quick.
I don't think the B2 from 2000000 to 456527000 is a problem, but a sign of it running the intended stage 2. Stage 2 needs to cover the primes from B1 to B2. (You could reduce iterations between screen outputs to get some progress reports along the way to confirm what it's starting at.)
"Swithcing" fails spell check.
Reporting a smaller B1 is a problem, especially if it would also do that in results to report. Check whether the results.json.txt also has the too-low 392000 value.
(Can't tell from here because 4869703 does not show a comparable P-1 result.)
I had to retype everything from the GUI, so a tiny typo crept in :)
I also left out some in-between progress lines, as they aren't relevant to this issue.

As you suggested, here is the first progress line:

Code:
[Jan 29 19:12] M4869703 stage 2 at B2=106801926 [3.70%]
I assume you are right, but I'm still curious why Prime95 would ignore the B1 done by gpuowl.

The output in results also mentions the 392000. I haven't submitted it, because it is useless in this state.

Code:
{"status":"NF", "exponent":4869703, "worktype":"P-1", "b1":392000, "b2":465119886, "d":8778, "poly1-size":1080, "poly2-size":3730, "fft-length":286720, "security-code":"65A59996", "program":{"name":"Prime95", "version":"30.8", "build":17, "port":4}, "timestamp":"2023-01-29 18:13:36", "user":"Stef42", "computer":"manual-pm1"}
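
A minimal sketch of an automated version of that check, for anyone running many of these: it parses the result line above and compares the reported b1 against the B1 that gpuowl actually completed (2000000 in this run). The function name is hypothetical. Only the "b1" key, which is visible in the line above, is relied upon.

```python
import json

# The result line from this post, copied verbatim.
RESULT = ('{"status":"NF", "exponent":4869703, "worktype":"P-1", "b1":392000, '
          '"b2":465119886, "d":8778, "poly1-size":1080, "poly2-size":3730, '
          '"fft-length":286720, "security-code":"65A59996", '
          '"program":{"name":"Prime95", "version":"30.8", "build":17, "port":4}, '
          '"timestamp":"2023-01-29 18:13:36", "user":"Stef42", '
          '"computer":"manual-pm1"}')


def b1_shortfall(result_line: str, expected_b1: int) -> int:
    """Return how far the reported B1 falls short of the expected one (0 if it does not)."""
    report = json.loads(result_line)
    return max(0, expected_b1 - int(report["b1"]))


if __name__ == "__main__":
    missing = b1_shortfall(RESULT, 2_000_000)
    if missing:
        print(f"reported B1 is {missing} below the stage 1 bound done by gpuowl")
```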

Last fiddled with by Stef42 on 2023-01-29 at 19:17
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
