mersenneforum.org  

2019-11-14, 02:24   #1
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
1CB4₁₆ Posts
gpuOwl Windows setup for Radeon VII

There are great instructions for running gpuowl on Linux / Radeon VII here: https://www.mersenneforum.org/showthread.php?t=23982

I am now trying to set up a Radeon VII in a Windows 10 box. I thought this would be a breeze, but alas it is a nightmare. Perhaps documenting the struggles here will provide me with the help I need as well as help others in the future.

1) Before beginning, I ran Windows Update.
2) I installed the card and rebooted. Windows continues to use the Intel on-chip GPU to drive the display.
3) I installed the latest AMD Radeon software, 19.11.1.
4) clinfo ran and identified the new card.
5) I installed gpuowl 6.7.4-win.

Alas, running gpuowl-win hangs nearly every time. I did get it to run successfully just once. Timings were horrible. The 32K FFT was getting timings similar to what I get running 5M FFTs on Linux.
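
To put "horrible" into rough numbers: if the time per squaring grows about like N log N in the FFT length N (the usual FFT cost model, assumed here rather than measured on this card), a 5M FFT should take a couple of hundred times longer than a 32K FFT. A minimal Python sketch of that arithmetic; the function name is illustrative only:

```python
import math

def fft_work(n_words):
    """Relative cost of one FFT of length n_words under an N*log2(N) model."""
    return n_words * math.log2(n_words)

small = 32 * 1024          # 32K FFT
large = 5 * 1024 * 1024    # 5M FFT

# How much longer a 5M squaring should take than a 32K one under this model.
expected_ratio = fft_work(large) / fft_work(small)
print(f"expected 5M / 32K cost ratio: about {expected_ratio:.0f}x")
```

Under that assumption, a 32K FFT that takes as long as a 5M FFT is running roughly two orders of magnitude slower than it should.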

Any helpful suggestions?

Last fiddled with by Prime95 on 2019-11-14 at 02:24
2019-11-14, 02:31   #2
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
16264₈ Posts
Tuning using Wattman

Awful. No, on second thought, it's worse.

I can see how to change the endpoint of the voltage vs frequency graph to limit the top speed of the GPU.

I see how to change the maximum memory setting to more than 1000 MHz. Yet when I run gpuowl it does not seem to "stick". Memory speed drops back down to 350 MHz rather quickly.

Compare this to Linux where you can select from 7 different speed settings, each with their own voltage. You can also specify the memory overclock and fixed fan speed.

Am I missing something?

Last fiddled with by Prime95 on 2019-11-14 at 13:32
2019-11-14, 02:45   #3
scan80269
"Sam"
Jun 2019
California, USA
2×3×5 Posts

There may be a setting in the motherboard BIOS setup that needs to be tweaked to get the Radeon card to be used as primary graphics adapter. The ability of the BIOS to auto-detect an installed discrete graphics card can vary from board to board. Some boards will auto-disable the Intel integrated graphics when a discrete graphics card is detected, but this "feature" is far from being universal, so your motherboard may well need manual intervention to make the Radeon card primary and keep the Intel integrated graphics disabled.

Once this is achieved, hopefully gpuOwL will run as expected off the Radeon card.
2019-11-14, 07:39   #4
M344587487
"Composite as Heck"
Oct 2017
1371₈ Posts

One option that will either be a breeze or won't is GPU passthrough. Normally Linux gamers run a Linux host with a Windows guest for troublesome games (usually to play nice with DRM or anti-cheat); this would be the rare reversal. I've never dared try it. The basic gist is to keep Windows using the Intel iGPU, install your Linux of choice in a VM using the VM software of your choice, enable AMD virtualisation and cross your fingers. This link has a few tips someone wrote up for doing the reverse that may be applicable; the Vega reset bug sounds familiar, but it may be old news: https://forum.level1techs.com/t/ubun...through/139074
2019-11-14, 08:17   #5
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
4900₁₀ Posts
Me too

Quote:
Originally Posted by scan80269
There may be a setting in the motherboard BIOS setup that needs to be tweaked to get the Radeon card to be used as primary graphics adapter. The ability of the BIOS to auto-detect an installed discrete graphics card can vary from board to board. Some boards will auto-disable the Intel integrated graphics when a discrete graphics card is detected, but this "feature" is far from being universal, so your motherboard may well need manual intervention to make the Radeon card primary and keep the Intel integrated graphics disabled.

Once this is achieved, hopefully gpuOwL will run as expected off the Radeon card.
Getting mine to act as a display was very easy. (No futzing with the BIOS at all, and Win10 just figured it out on its own. Perhaps made easier by the fact that it's a workstation design with dual Xeons and no IGP present.) Getting gpuowl to run on it, though, is not easy and not accomplished yet. It's an XFX.
The eventual plan is to have the box dual boot Win10/Ubuntu 18.04, so that when such things arise I can separate variables and test the same hardware and gpuowl commit on either OS.
The odd thing about this setup is that I originally bought the system with an NVIDIA card offered as part of the package, rotated a few NVIDIA models through it, and finally switched to AMD beginning with an RX550. That was fine. The Radeon VII is not. Multiple driver versions have been tried. So far I've managed 30 seconds of 756839 PRP with the Radeon VII from a cold startup. It never gets particularly warm, but usually it hangs quickly. Several attempts with 24M P-1 got nowhere. The GPU clock rate stays high but no progress lines are output, and the memory clock rate goes low. (Use GPU-Z or other monitoring software.)

Something else I've noticed is that the OpenCL compiles when launching gpuowl seem slow, at over 3 seconds on a dual Xeon E5-2697 for small exponents.
So I tried to put Linux on it and ran into some obstacles, then tossed mfakto on the Win10 installation and ran -st:

Selftest statistics
number of tests 34026
successful tests 34026

selftest PASSED!

It runs at ~1300 GHzD/day in mfakto. Not what I bought it for though; that was gpuowl.

One other oddity I saw was that GPU-Z via Win10 remote desktop reported the RX550 sensors OK but mostly gives 0 values for the Radeon VII. The Radeon VII is OK without remote desktop, though.
2019-11-14, 08:41   #6
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1001100100100₂ Posts

Quote:
Originally Posted by Prime95
Awful. No, on second thought, it's worse.

I can see how to change the endpoint of the voltage vs frequency graph to limit the top speed of the GPU.

I see how to change the maximum memory setting to more than 1000 MHz. Yet when I run gpuowl it does not seem to "stick". Memory speed drops back down to 350 MHz rather quickly.

Compare this to Linux where you can select from 7 different speed settings, each with their own voltage. You can also specify the memory overclock and fixed fan speed.

Am I missing something?
On mine, gpuowl hangs. The memory clock rate goes back to idle after the hang. Mfakto can still use the card after that. Running now via a TightVNC remote desktop, on which the GPU-Z sensors work, I see a 350 MHz memory clock when the card is idle, and as low as 21 MHz for the GPU clock. Mfakto boosts it to about 1800 MHz GPU clock and 1000 MHz memory clock. How does yours act if you load it with mfakto?


Code:
2019-11-14 02:24:43 gpuowl v6.11-9-g9ae3189
2019-11-14 02:24:43 Note: no config.txt file found
2019-11-14 02:24:43 config: -user kriesel -cpu roa/radeonvii -use ORIG_X2 -device 0
2019-11-14 02:24:43 worktodo.txt: ";B1=16000,B2=300000;PFactor=0,1,2,24000577,-1,70,2" ignored
2019-11-14 02:24:43 756839 FFT 64K: Width 8x8, Height 64x8; 11.55 bits/word
2019-11-14 02:24:43 using long carry kernels
2019-11-14 02:24:44 OpenCL args "-DEXP=756839u -DWIDTH=64u -DSMALL_HEIGHT=512u -DMIDDLE=1u -DWEIGHT_STEP=0xa.f0aa0ed02db8p-3 -DIWEIGHT_STEP=0xb.b33887f7acp-4 -DWEIGHT_BIGSTEP=0x8.b95c1e3ea8bd8p-3 -DIWEIGHT_BIGSTEP=0xe.ac0c6e7dd2438p-4 -DORIG_X2=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-11-14 02:24:55 OpenCL compilation in 10721 ms
2019-11-14 02:24:55 756839 OK   408000  53.90%;  123 us/sq; ETA 0d 00:01; b5f53d5eaae113cd (check 0.08s)
2019-11-14 02:25:00 756839      450000  59.45%;  121 us/sq; ETA 0d 00:01; 94bbf35d39f0a90c
2019-11-14 02:25:06 756839 OK   500000  66.05%;  119 us/sq; ETA 0d 00:01; 884d75f60447d260 (check 0.08s)
2019-11-14 02:25:12 756839      550000  72.66%;  118 us/sq; ETA 0d 00:00; 6263fe5d57358c83
2019-11-14 02:25:19 756839      600000  79.26%;  122 us/sq; ETA 0d 00:00; 2557dc3ef4f1eacd
2019-11-14 02:25:25 756839      650000  85.87%;  122 us/sq; ETA 0d 00:00; a855d8743ff911f3
2019-11-14 02:25:31 756839      700000  92.47%;  116 us/sq; ETA 0d 00:00; db9dda732edf0c92
2019-11-14 02:25:37 756839 OK   750000  99.08%;  121 us/sq; ETA 0d 00:00; 447a5283421f241d (check 0.07s)
2019-11-14 02:25:38 PP   756839 / 756839, 0000000000000001
2019-11-14 02:25:38 756839 OK   757000 100.00%;  131 us/sq; ETA 0d 00:00; eb0f7242cae7cf3d (check 0.06s)
2019-11-14 02:25:38 {"exponent":"756839", "worktype":"PRP-3", "status":"P", "program":{"name":"gpuowl", "version":"v6.11-9-g9ae3189"}, "timestamp":"2019-11-14 08:25:38 UTC", "user":"kriesel", "computer":"roa/radeonvii", "aid":"0", "fft-length":65536, "res64":"0000000000000001", "residue-type":1, "errors":{"gerbicz":0}}
2019-11-14 02:25:38 worktodo.txt: ";B1=16000,B2=300000;PFactor=0,1,2,24000577,-1,70,2" ignored
2019-11-14 02:25:38 81182987 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 17.20 bits/word
2019-11-14 02:25:38 OpenCL args "-DEXP=81182987u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -DWEIGHT_STEP=0xd.e1a424e96ffd8p-3 -DIWEIGHT_STEP=0x9.389124b8b5438p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DORIG_X2=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-11-14 02:25:45 OpenCL compilation in 7011 ms
(No further progress in gpuowl in 30 minutes)
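
A hang like this can be spotted from the log alone: every line carries a timestamp, and progress lines also carry the exponent, the iteration count and the time per squaring. Below is a minimal Python sketch of a log check, assuming the line layout shown in the log above. The regular expressions, the function names and the 5-minute threshold are illustrative choices, not part of gpuowl.

```python
import re
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"
STAMP = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
# Progress lines look like:
# 2019-11-14 02:25:00 756839      450000  59.45%;  121 us/sq; ETA ...
PROGRESS = re.compile(STAMP + r" (\d+)\s+(?:OK\s+)?(\d+)\s+[\d.]+%;\s+(\d+) us/sq;")
ANY_LINE = re.compile(STAMP + r" ")

def progress_records(lines):
    """Return (time, exponent, iteration, us_per_sq) for each progress line."""
    records = []
    for line in lines:
        m = PROGRESS.match(line)
        if m:
            records.append((datetime.strptime(m.group(1), FMT),
                            int(m.group(2)), int(m.group(3)), int(m.group(4))))
    return records

def is_stalled(lines, now, limit=timedelta(minutes=5)):
    """True if the newest timestamped line is older than `limit`."""
    stamps = [datetime.strptime(m.group(1), FMT)
              for m in map(ANY_LINE.match, lines) if m]
    return bool(stamps) and now - max(stamps) > limit

# A few lines copied from the log above.
log = [
    "2019-11-14 02:24:55 756839 OK   408000  53.90%;  123 us/sq; ETA 0d 00:01; b5f53d5eaae113cd (check 0.08s)",
    "2019-11-14 02:25:00 756839      450000  59.45%;  121 us/sq; ETA 0d 00:01; 94bbf35d39f0a90c",
    "2019-11-14 02:25:38 81182987 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 17.20 bits/word",
    "2019-11-14 02:25:45 OpenCL compilation in 7011 ms",
]
records = progress_records(log)
mean_us = sum(r[3] for r in records) / len(records)
print(f"{len(records)} progress lines, mean {mean_us:.1f} us/sq")
print("stalled:", is_stalled(log, datetime(2019, 11, 14, 2, 55, 45)))
```

With the last log line at 02:25:45 and a check time 30 minutes later, the run is reported as stalled. In practice the check time would be the current clock time and the lines would come from the gpuowl log file.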

Last fiddled with by kriesel on 2019-11-14 at 08:58
2019-11-14, 10:04   #7
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2²·5²·7² Posts

Quote:
Originally Posted by kriesel
(No further progress in gpuowl in 30 minutes)
And in the Windows System event log, I note a couple of event 4101 entries, logged in the minute after the last gpuowl output line: "Display driver amdkmdap stopped responding and has successfully recovered." This signifies Windows judging the display driver to be hung and restarting it, which tends to be bad for whatever GPU application was running at the time. This happened even though I had already put the registry entry in place to change the timeout from the 2-second default to 16 seconds. Upping it to 32 seconds made no difference.

See https://www.mersenneforum.org/showpo...3&postcount=10
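
For readers who have not set this timeout before, here is a hedged sketch of producing such a registry file. It assumes the entry meant above is the TdrDelay DWORD under the GraphicsDrivers key, which Windows documents as the display-driver timeout in seconds (default 2); the post itself does not name the value. The helper name `reg_dword_file` is made up for this example.

```python
def reg_dword_file(key, values):
    """Render .reg file text that sets DWORD values under one registry key."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, number in values.items():
        if not 0 <= number <= 0xFFFFFFFF:
            raise ValueError(f"{name} does not fit in a DWORD: {number}")
        # .reg files store DWORDs as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{number:08x}')
    return "\n".join(lines) + "\n"

# Assumed location of the timeout setting (not named in the post).
TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"

tdr_text = reg_dword_file(TDR_KEY, {"TdrDelay": 16})
print(tdr_text)
```

The generated text can be saved as a .reg file and imported like any other. Changing 16 to 32 gives dword:00000020.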

Last fiddled with by kriesel on 2019-11-14 at 10:05
2019-11-14, 14:44   #8
tServo
"Marv"
May 2009
near the Tannhäuser Gate
605₁₀ Posts

Quote:
Originally Posted by Prime95
Awful. No, on second thought, it's worse.

I can see how to change the endpoint of the voltage vs frequency graph to limit the top speed of the GPU.

I see how to change the maximum memory setting to more than 1000 MHz. Yet when I run gpuowl it does not seem to "stick". Memory speed drops back down to 350 MHz rather quickly.

Compare this to Linux where you can select from 7 different speed settings, each with their own voltage. You can also specify the memory overclock and fixed fan speed.

Am I missing something?
George,
I, too, am running my Radeon VII on a Windows 10 system and had a devil of a time getting it to work. Here's what finally worked for me:

I always install MSI Afterburner on my GPU systems to keep tabs on temps & memory and adjust the fans. In this case, it was one of the keys to getting it to run at all since I used it to adjust the power setting to max. Without that, Gpuowl would run for about 10 seconds then the entire system would hang with a black screen.

I also used Afterburner to crank the fans to 85-90%.

Another thing that almost drove me crazy was that I had to start Gpuowl with the -d 0 flag even though it's device 0 (as shown by Gpuowl)! The Radeon VII drives my display, but my CPU/mobo has a "phantom" graphics device that can't be used (it's dummied out in the BIOS).
None of my Vegas are like that.

Finally, after getting Windows fully updated, I used the Win settings/internet menu to completely disconnect the internet so updates would not float in and reboot my machine.

Every so often, I reconnect to get updates, but on MY terms.

A note: Afterburner's numbering of GPUs is NOT the same as Gpuowl's. Use Gpuowl's numbering for the -d parameter.
I have also noticed that Afterburner sometimes numbers Nvidia cards differently as well.
Afterburner cannot change some settings on AMD that Wattman can.

I am using Gpuowl 5.0
2019-11-14, 15:01   #9
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2²×5²×7² Posts

Quote:
Originally Posted by tServo
I am using Gpuowl 5.0
tServo, thanks for that response. Yes, Windows 10's independent ways about updates are a nuisance. A surefire way to defeat incoming updates for tests is to unplug the network cable from the system (or turn off the wireless radio).
Please give gpuowl V6.11-9 or 6.7-4 a quick trial run on a task with an FFT size of more than 1M on your system. I find I can run a very short-FFT task OK but not usefully large ones. https://www.mersenneforum.org/showpo...postcount=1403 or https://www.mersenneforum.org/showpo...postcount=1343

Also, you may want to upgrade, since v6.2 benchmarked faster than v5.0 in the 87M-115M range (on RX480 that is). https://www.mersenneforum.org/showpo...35&postcount=2

I'm using the command line "gpuowl-win -user kriesel -cpu roa/radeonvii -use ORIG_X2 -device 0" in a batch file, and having the application hang issue on V6.11-9. The console remains usable for other things.

George, a very independent verification of the gpuowl hangs I'm observing is the separate inline wattmeter the system power cord is plugged into. I've had systems where the last GPU in, or if high wattage, the first, won't run due to power limits. Mfakto ruled that out for me in this Radeon VII case, since TF generally takes more power than any of the DP applications. (I have a system where a GTX1080 will run PRP or LL or P-1, but TF on the same GTX1080 takes the system down.) Mfakto is taking the Radeon VII power to 180-240 W with frequent fluctuation.

Last fiddled with by kriesel on 2019-11-14 at 15:32
2019-11-14, 15:43   #10
tServo
"Marv"
May 2009
near the Tannhäuser Gate
5×11² Posts

Quote:
Originally Posted by kriesel
Tservo, thanks for that response. Yes, Windows 10's independent ways about updates are a nuisance. A surefire way to defeat incoming updates for tests is to unplug the network cable from the system (or turn off the wireless radio).
Please give gpuowl V6.11-9 or 6.7-4 a quick trial run on a task with fft size of more than 1M on your system. I find I can run a very short-fft task ok but not usefully large ones. https://www.mersenneforum.org/showpo...postcount=1403 or https://www.mersenneforum.org/showpo...postcount=1343

Also, you may want to upgrade, since v6.2 benchmarked faster than v5.0 in the 87M-115M range (on RX480 that is). https://www.mersenneforum.org/showpo...35&postcount=2

I'm using the command line "gpuowl-win -user kriesel -cpu roa/radeonvii -use ORIG_X2 -device 0" in a batch file, and having the application hang issue on V6.11-9. The console remains usable for other things.

George, a very independent verification of the gpuowl hangs I'm observing is the separate inline wattmeter the system power cord is plugged into. I've had systems where the last gpu in, or if high wattage, the first, won't run due to power limits. Mfakto ruled that out for me in this Radeon VII case, since TF generally takes more power than any of the DP applications. (I have a system where a GTX1080 will run PRP or LL or P-1 but TF on the same GTX1080 takes the system down.) Mfakto is taking the RadeonVII power to 180-240W with frequent fluctuation.
Kriesel,
Thanks for the executables. I will try them either tonite or tomorrow after I clear up some personal hassles (&^%$#@!@-ing cars). (I am especially keen to get Gpuowl to run on Nvidia.)

Just curious, did you try the Afterburner power=max setting? I believe that is necessary because of the way windoze monitors video device drivers. If an app uses a lot of power, the card might throttle a lot, and windoze might interpret that as an error and try to recover the driver, causing a hang.
2019-11-14, 15:56   #11
ATH
Einyen
Dec 2003
Denmark
2·3·5·101 Posts

Quote:
Originally Posted by tServo
Finally, after getting Windows fully updated, I used the Win settings/internet menu to completely disconnect the internet so updates would not float in and reboot my machine.

Every so often, I reconnect to get updates, but on MY terms.
This solution works for me to avoid reboots without disconnecting the internet: create this as a .reg text file and run it to add it to the registry. It disables AutoUpdate and also prevents automatic reboots while a user is logged on:

Code:
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU]
"AUOptions"=dword:00000002
"NoAutoUpdate"=dword:00000001
"NoAutoRebootWithLoggedOnUsers"=dword:00000001
As an added precaution I also use this to set all my connections to "Metered"; Windows Update will then not download updates by itself:

Code:
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\NetworkList\DefaultMediaCost]
"3G"=dword:00000002
"4G"=dword:00000002
"Default"=dword:00000002
"Ethernet"=dword:00000002
"WiFi"=dword:00000002

When you do check for updates and download them manually, check after the reboot that the registry settings have not been changed by the update, or just run the .reg files again and reboot one more time.
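
The "check that the settings have not been changed" step can be automated. This is a minimal Python sketch that parses the DWORD entries of two .reg texts and lists the differences. It assumes you compare the wanted file against a fresh export of the same key; the function names and the sample "after update" text are hypothetical.

```python
def parse_reg(text):
    """Return {key: {value_name: int}} for the dword entries in .reg text."""
    settings = {}
    key = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            key = line[1:-1]
            settings.setdefault(key, {})
        elif key and line.startswith('"') and "=dword:" in line:
            name, value = line.split("=dword:", 1)
            settings[key][name.strip('"')] = int(value, 16)
    return settings

def changed_settings(wanted, current):
    """List (key, name, wanted_value, found_value) where the two differ."""
    diffs = []
    for key, values in wanted.items():
        for name, number in values.items():
            found = current.get(key, {}).get(name)
            if found != number:
                diffs.append((key, name, number, found))
    return diffs

WANTED = r"""Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU]
"AUOptions"=dword:00000002
"NoAutoUpdate"=dword:00000001
"NoAutoRebootWithLoggedOnUsers"=dword:00000001
"""

# Hypothetical export taken after an update reset one of the values.
AFTER_UPDATE = WANTED.replace('"NoAutoUpdate"=dword:00000001',
                              '"NoAutoUpdate"=dword:00000000')

diffs = changed_settings(parse_reg(WANTED), parse_reg(AFTER_UPDATE))
for key, name, want, found in diffs:
    print(f"{name}: wanted {want}, found {found}")
```

Files written by regedit's export are typically UTF-16, so when reading a real export, open it with that encoding before passing the text to `parse_reg`.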

Last fiddled with by ATH on 2019-11-14 at 15:58
All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.