mersenneforum.org  

2019-11-14, 16:12   #12
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
47·109 Posts

Quote:
Originally Posted by tServo
> Kriesel,
> Thanks for the executables. I will try them either tonight or tomorrow after I clear up some personal hassles (&^%$#@!@-ing cars). (I am especially keen to get Gpuowl to run on Nvidia.)
>
> Just curious, did you try the Afterburner power=max? I believe that is necessary because of the way windoze monitors video device drivers. If an app uses a lot of power, it might throttle a lot and windoze might interpret that as an error and try to recover it, causing a hang.

You're welcome. Enjoy the increased speed where it's available (on the gpu, but not too much on the road).
Afterburner is not yet installed on the system. Wattman was wiped off in one of the driver changes and has not yet reentered the scene. Mfakto driving the gpu to max power is a nonissue. gpuowl draws ~60 W according to GPU-Z, and in some preliminary testing with -iter 10000 -time on known smallish primes, some fft lengths hang and some don't.

2019-11-14, 17:04   #13
tServo
"Marv"
May 2009
near the Tannhäuser Gate
2⁷×5 Posts

Quote:
Originally Posted by Prime95
> Awful. No, on second thought, it's worse.
>
> I can see how to change the endpoint of the voltage vs frequency graph to limit the top speed of the GPU.
>
> I see how to change the maximum memory setting to more than 1000 MHz. Yet when I run gpuowl it does not seem to "stick". Memory speed drops back down to 350 MHz rather quickly.
>
> Compare this to Linux where you can select from 7 different speed settings, each with their own voltage. You can also specify the memory overclock and fixed fan speed.
>
> Am I missing something?

In trying the latest versions Kriesel kindly provided, I just remembered that I find it necessary to use the -fft +n parameter where n usually = 3.
This is for exponents at the current wavefront. Smaller ones, maybe not so?

2019-11-14, 17:45   #14
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
47×109 Posts

Quote:
Originally Posted by tServo
> In trying the latest versions Kriesel kindly provided, I just remembered that I find it necessary to use the -fft +n parameter where n usually = 3.
> This is for exponents at the current wavefront. Smaller ones, maybe not so?

Interesting. On gpuowl v6.2 (Win7, RX480), +3 was the most likely to generate an EE load error and fail to run. All my Radeon VII attempts to date have been at an implied -fft +0.

2019-11-14, 19:08   #15
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
2²·1,873 Posts

I'm still stalled here -- working through the suggestions.

I changed the BIOS to turn off the Intel IGP. I did not expect that to help, but it eliminates a variable.

It seems like Ken and I have very similar issues. Running gpuowl causes a black screen, the driver resets, Windows recovers, but gpuowl is hung. The Windows event viewer says amdkmdap reset and recovered. I've not tried mfakto yet.

I've downloaded MSI Afterburner. Any attempt to change settings using that tool is ignored.

I've not tried the GPU passthrough yet. My goal was to run it in Windows rather than go to a dual boot / Virtual Box scenario. This will be the last gasp attempt.

One online user of other software reported changing to a beefier power supply helped. Maybe we have a power spike issue launching gpuowl. tServo, our only successful Windows gpuowler, uses MSI Afterburner to max the power before starting gpuowl???? Ken, can you run gpuowl after you have mfakto started? I've got a 650 watt power supply which should be ample.

Just tried setting HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers TdrLevel to zero. No better gpuowl behavior.
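
For anyone repeating the TdrLevel experiment, here is a minimal Python sketch that renders the change as .reg file text, in the same layout as the .reg examples ATH posted. The function name `render_tdr_reg` is made up for illustration; the key path and value name are the ones given above. Microsoft's TDR documentation describes TdrLevel 0 as turning timeout detection off; whether that is wise on a given system is a separate question.

```python
# Sketch only: build .reg text that sets TdrLevel under GraphicsDrivers.
# render_tdr_reg is an illustrative name, not part of any existing tool.

KEY = r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers"


def render_tdr_reg(level: int = 0) -> str:
    """Return .reg file text that sets TdrLevel to `level` as a REG_DWORD."""
    if not 0 <= level <= 0xFFFFFFFF:
        raise ValueError("TdrLevel must fit in a DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        f'"TdrLevel"=dword:{level:08x}',  # dword values are 8 hex digits
        "",
    ]
    return "\r\n".join(lines)


print(render_tdr_reg(0))
```

Saving the printed text to a file with a .reg extension and importing it should be equivalent to editing the value by hand in regedit; a reboot is probably needed before the driver picks it up.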

Last fiddled with by Prime95 on 2019-11-14 at 20:05

2019-11-14, 19:35   #16
xx005fs
"Eric"
Jan 2018
USA
2²×53 Posts

I had similar issues with regular Vega cards, and it seems that after a certain version of the AMD driver, I had to use -carry long instead of short for it to work. It dropped the performance by around 3-5%, but at least it works. I don't know if this applies to the Radeon VII, or if it's specific to my use case of mixing Nvidia cards with AMD cards. I am still using the default FFT size chosen by gpuowl.

Last fiddled with by xx005fs on 2019-11-14 at 19:35

2019-11-14, 20:38   #17
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1403₁₆ Posts

I believe the issue I am having with the XFX Radeon VII is not a power issue, because:
1) as I recall, it did not make a difference whether both Xeons were going full tilt on prime95 or idle, or whether I removed an RX550 I had installed with it;
2) mfakto uses more power and runs indefinitely, while gpuowl has run successfully only on small exponents for short periods, drawing less power while it runs;
3) the power meter at the power cord says wattage is always under 600W total, while the power supply is nominally an 80+ Gold 1120W;
4) the issue appears to be fft-length or history dependent or both.

If I change my Windows backup arrangement or pop another drive in, I may be able to get to a dual-boot solution for test purposes.

I recently ran the following, in the order shown, with results as noted on each line. Each line was annotated and then commented out to let the next run in its turn, necessarily with some gpu idle time in between, because I was doing other things during the several-minute waits. Toward the end, mfakto was no longer working either, although in between some of the runs it was. Mfakto gave the error message OpenCL -6 (out of host memory) a couple of times toward the end.
Code:
;PRP=0,1,2,132049,-1,40,0,3,1 hung
;PRP=0,1,2,756839,-1,44,0,3,1 ok
;PRP=0,1,2,1398259,-1,60,0,3,1 ok
;PRP=0,1,2,2976221,-1,60,0,3,1 hung
;PRP=0,1,2,6972593,-1,60,0,3,1 hung; opencl compile ~9. seconds is last screen output
;PRP=0,1,2,13466917,-1,64,0,3,1 hung before opencl compile complete, returns to command prompt in ~320. seconds.
;PRP=0,1,2,24036583,-1,70,0,3,1 hung before opencl compile complete, returns to command prompt in ~318. seconds.
;PRP=0,1,2,42643801,-1,72,0,3,1 hung before opencl compile complete, returns to command prompt in ~322. seconds.
;PRP=0,1,2,82589933,-1,76,0,3,1 hung before opencl compile complete, returns to command prompt in ~318. seconds.
;PRP=0,1,2,756839,-1,44,0,3,1 not ok this time; see preceding line

System has been shut down and an RX550 2GB added, then restarted. A real disparity.
I'm contemplating activating command-line remote access on this system. Having 3 GUI access paths all down simultaneously for annoying durations is, well, annoying. (If the local display is out, both remote desktop and TightVNC are temporarily disabled also. Sometimes when the display image lingers or returns, not sure which, keyboard and mouse appear inactive. The system remains responsive to network traffic, as evidenced by ICMP ping replies to other hosts.)

I will continue working this problem a while.
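
The run log in this post is short enough to read by eye, but tabulating it makes the history dependence easier to see. A minimal Python sketch, assuming the fourth comma-separated field of each PRP line is the exponent and treating any note that does not start with "ok" as a failure; the function names are made up for illustration:

```python
# Sketch only: tabulate the commented-out worktodo lines and their outcomes.
RUNS = """\
;PRP=0,1,2,132049,-1,40,0,3,1 hung
;PRP=0,1,2,756839,-1,44,0,3,1 ok
;PRP=0,1,2,1398259,-1,60,0,3,1 ok
;PRP=0,1,2,2976221,-1,60,0,3,1 hung
;PRP=0,1,2,6972593,-1,60,0,3,1 hung; opencl compile ~9. seconds is last screen output
;PRP=0,1,2,13466917,-1,64,0,3,1 hung before opencl compile complete, returns to command prompt in ~320. seconds.
;PRP=0,1,2,24036583,-1,70,0,3,1 hung before opencl compile complete, returns to command prompt in ~318. seconds.
;PRP=0,1,2,42643801,-1,72,0,3,1 hung before opencl compile complete, returns to command prompt in ~322. seconds.
;PRP=0,1,2,82589933,-1,76,0,3,1 hung before opencl compile complete, returns to command prompt in ~318. seconds.
;PRP=0,1,2,756839,-1,44,0,3,1 not ok this time; see preceding line
"""


def parse_runs(text):
    """Return [(exponent, passed, note), ...] in the order the runs were made."""
    runs = []
    for line in text.splitlines():
        if not line.startswith(";PRP="):
            continue
        entry, _, note = line.partition(" ")
        # Assumption: field 4 of the PRP entry is the exponent.
        exponent = int(entry.split("=", 1)[1].split(",")[3])
        runs.append((exponent, note.startswith("ok"), note))
    return runs


def inconsistent(runs):
    """Exponents that passed on one run and failed on another."""
    passed = {e for e, ok, _ in runs if ok}
    failed = {e for e, ok, _ in runs if not ok}
    return sorted(passed & failed)


for exponent, ok, note in parse_runs(RUNS):
    print(f"{exponent:>9}  {'ok  ' if ok else 'FAIL'}  {note}")
print("passed and failed at different times:", inconsistent(parse_runs(RUNS)))
```

On these ten lines the only exponent that both passed and failed is 756839, and the smallest exponent tried (132049) failed while two larger ones ran, which fits point 4 above better than a pure size or power explanation.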

2019-11-14, 20:45   #18
tServo
"Marv"
May 2009
near the Tannhäuser Gate
2⁷×5 Posts

I can't get either ver 6.7 or 6.11 to work. The message for both is:
"implicit decl of __asm is invalid in C99"
and then a page of examples of __asm use.

I am using video driver 19.4.1
Is that too old?
I will try a newer version this evening.
I'm skeptical that will solve anything.

2019-11-14, 20:50   #19
tServo
"Marv"
May 2009
near the Tannhäuser Gate
2⁷·5 Posts

Quote:
Originally Posted by ATH
> This solution works for me to avoid reboots but without disconnecting the internet: Create this as a .reg text file and run it to add it to the registry. It disables AutoUpdate as well as disabling reboot with a logged-on user:
>
> Code:
> Windows Registry Editor Version 5.00
>
> [HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU]
> "AUOptions"=dword:00000002
> "NoAutoUpdate"=dword:00000001
> "NoAutoRebootWithLoggedOnUsers"=dword:00000001
>
> As an added precaution I also use this to set all my connections to "Metered"; then Windows Update will not download updates by itself:
>
> Code:
> Windows Registry Editor Version 5.00
>
> [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\NetworkList\DefaultMediaCost]
> "3G"=dword:00000002
> "4G"=dword:00000002
> "Default"=dword:00000002
> "Ethernet"=dword:00000002
> "WiFi"=dword:00000002
>
> When you do check for updates and download them manually, then after reboot check that the registry settings have not been changed by the update, or just run the .reg files again and reboot 1 more time.

ATH,
Thanks for the registry update. I will try it soon.
I originally tried using the "metered connection" setting but after a few months all my machines started rebooting due to updates.
I suspect that Microsoft slows down downloads but does not completely stop them so after a while the download has completed and it does the update.
GRRRRRRRRRRRRRRRRRRRRRRR
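
ATH's advice to re-check the registry after each manual update can be partly scripted. A minimal Python sketch that parses .reg text and reports which dword values have drifted; it only handles the simple `"name"=dword:xxxxxxxx` lines used in the quoted files, and `parse_reg` and `drifted` are made-up names. How the current values are obtained is left open; one option is exporting the same key from regedit and parsing that export with the same function.

```python
import re

# The first .reg file quoted above, used as the "wanted" state.
WANTED = r"""Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU]
"AUOptions"=dword:00000002
"NoAutoUpdate"=dword:00000001
"NoAutoRebootWithLoggedOnUsers"=dword:00000001
"""

DWORD_LINE = re.compile(r'"([^"]+)"=dword:([0-9a-fA-F]{8})')


def parse_reg(text):
    """Parse .reg text into {key_path: {value_name: int}} (dword values only)."""
    result = {}
    key = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            key = line[1:-1]
            result[key] = {}
            continue
        match = DWORD_LINE.fullmatch(line)
        if match and key is not None:
            result[key][match.group(1)] = int(match.group(2), 16)
    return result


def drifted(expected, actual):
    """List (key, name, wanted, found) for each value that differs or is missing."""
    changes = []
    for key, values in expected.items():
        for name, wanted in values.items():
            found = actual.get(key, {}).get(name)  # None when missing
            if found != wanted:
                changes.append((key, name, wanted, found))
    return changes


expected = parse_reg(WANTED)
# Simulate an update that silently re-enabled automatic updates.
after_update = parse_reg(WANTED.replace(
    '"NoAutoUpdate"=dword:00000001', '"NoAutoUpdate"=dword:00000000'))
for key, name, wanted, found in drifted(expected, after_update):
    print(f"{name}: wanted {wanted}, found {found}")
```

An empty list from `drifted` means the settings survived the update; anything else means it is time to run the .reg files again.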

2019-11-14, 21:06   #20
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
1110101000100₂ Posts

Quote:
Originally Posted by kriesel
> I believe the issue I am having with the XFX Radeon VII is not a power issue, .

I'm not suggesting a "total power" issue, rather a "power spiking" issue. I'm trying to enumerate differences between Linux setups and this Windows setup. One may be the way we set memory clock speed and core clock speed. I'm running the mfakto selftest now, and the core and memory speeds are bouncing all over the place. I was under the impression that in Linux the speeds are more stable. I could be wrong. Even if I'm right, it may have nothing to do with our problems.

I had a similar issue on a Haswell CPU. I'd get crashes ramping up mprime or ramping down mprime. By disabling C states in the BIOS and running at a constant voltage the crashes went away.

2019-11-14, 21:07   #21
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
2²×1,873 Posts

Quote:
Originally Posted by tServo
> I can't get either ver 6.7 or 6.11 to work. The message for both is:
> "implicit decl of __asm is invalid in C99"
> and then a page of examples of __asm use.
>
> I am using video driver 19.4.1
> Is that too old?
> I will try a newer version this evening.
> I'm skeptical that will solve anything.

Add "-use ORIG_X2" to config.txt in the gpuowl directory.
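
A minimal Python sketch of that edit, for anyone applying it across several machines. It assumes config.txt holds command-line style options separated by spaces on one line, which is an assumption about the file layout and not something confirmed in this thread; `ensure_flag` is a made-up helper name.

```python
from pathlib import Path


def ensure_flag(config_path, flag="-use ORIG_X2"):
    """Append `flag` to config.txt unless already present. Return True if the file changed."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    if flag in text:
        return False  # nothing to do
    existing = text.rstrip()
    # Assumption: options live on one line, separated by spaces.
    new_text = f"{existing} {flag}\n" if existing else f"{flag}\n"
    path.write_text(new_text)
    return True
```

Calling `ensure_flag("config.txt")` from inside the gpuowl directory creates the file if it is missing and leaves it alone on a second call. The same helper would work for other options mentioned here, such as `-carry long`.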

Last fiddled with by Prime95 on 2019-11-14 at 21:08

2019-11-14, 21:25   #22
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
47·109 Posts

Quote:
Originally Posted by tServo
> I can't get either ver 6.7 or 6.11 to work. The message for both is:
> "implicit decl of __asm is invalid in C99"
> and then a page of examples of __asm use.
>
> I am using video driver 19.4.1
> Is that too old?
> I will try a newer version this evening.
> I'm skeptical that will solve anything.

Sounds familiar. See the -use related posts in the gpuowl thread, https://www.mersenneforum.org/showpo...postcount=1212 through about 1222.

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.