mersenneforum.org  

Old 2018-01-31, 21:37   #1
Cubox
 
Dec 2017

5² Posts
CUDALucas making GPU not boost

Hi,

I'm seeing a weird glitch with CUDALucas since I upgraded my motherboard (from an MSI B85-G43 to an ASUS Z97-A; same CPU, RAM, Windows 10 installation on the latest build, GPU, GPU drivers, etc.).

When running CUDALucas on a current Mersenne exponent (45582223, for example), the GPU won't use all of its power. My normal usage (in games, for example) is core 2012 MHz, memory 4314 MHz, voltage 1049 mV (this is with a very stable overclock of +101/+512).

CUDALucas only reaches 1582 MHz core and 718 mV. The GPU reports "no load" as the limit.

The weird thing is that if I start Google Chrome, even with an empty tab, the GPU will use all of its power (and the limit switches to "voltage").

This happens with CUDALucas 2.05 and 2.06beta. I can't think of a reason why this is happening.
This is not a showstopper, since I can get the normal clocks by keeping Chrome open, but it's weird.

Any ideas?
Old 2018-02-01, 05:51   #2
LaurV
Romulan Interpreter
 
"name field"
Jun 2011
Thailand

7×1,423 Posts

What GPU? (i.e. what DP/SP ratio?)

Your games use (only) a lot of SP calculations (FP16 and FP32), but CL uses a lot of DP calculations (FP64). For the first, you have an army of ants running through your cores, moving the rice grain by grain; for the second, you have an army of elephants doing the same amount of work: they are fewer, but heavier, moving the rice bag by bag. For the second, it is normal for the clocks to drop; for example, a Titan (classic) will go from 1.1 GHz to 780 MHz or so, even though the power consumption increases.

This is normal. GPUs have different DP/SP ratios, varying from 1/2 to 1/48 or so. Some are better for LL (CUDALucas, FP64), some are better for TF (mfaktX, SP, etc.).
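To put rough numbers on the DP/SP ratio, here is a minimal Python sketch. The core count and clock are hypothetical values chosen for round results, not the specs of any card in this thread, and the "2 FLOPs per core per cycle" figure assumes one fused multiply-add per cycle, which is the usual way peak throughput is quoted.

```python
def peak_gflops(cores, clock_mhz, dp_ratio=1.0):
    """Theoretical peak throughput in GFLOPS.

    Assumes each core retires one fused multiply-add (2 FLOPs) per cycle.
    dp_ratio is the DP/SP ratio (1.0 gives the single-precision figure).
    """
    return 2 * cores * clock_mhz * dp_ratio / 1000.0


if __name__ == "__main__":
    cores, clock_mhz = 2000, 1500  # hypothetical card, not a real model
    print("FP32 peak      :", peak_gflops(cores, clock_mhz))
    for ratio in (1 / 2, 1 / 48):  # the two extremes mentioned above
        print("FP64 at ratio %.4f:" % ratio, peak_gflops(cores, clock_mhz, ratio))
```

With the same core clock, the two extreme ratios differ by a factor of 24 in FP64 throughput, which is why a card that is fine for TF can be a poor choice for LL.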

What time per iteration do you witness? Does it match the tables for your GPU?

Can you try to run mfaktX (which also won't use much DP) and see if the clocks "stay"? (The heat, however, will increase.)

If so, then there is nothing wrong, your GPU works as expected.

Last fiddled with by LaurV on 2018-02-01 at 05:56 Reason: link
Old 2018-02-02, 01:41   #3
Cubox
 
Dec 2017

31₈ Posts

Quote:
Originally Posted by LaurV View Post
What GPU? (i.e. what DP/SP ratio?)

Your games use (only) a lot of SP calculations (FP16 and FP32), but CL uses a lot of DP calculations (FP64). For the first, you have an army of ants running through your cores, moving the rice grain by grain; for the second, you have an army of elephants doing the same amount of work: they are fewer, but heavier, moving the rice bag by bag. For the second, it is normal for the clocks to drop; for example, a Titan (classic) will go from 1.1 GHz to 780 MHz or so, even though the power consumption increases.

This is normal. GPUs have different DP/SP ratios, varying from 1/2 to 1/48 or so. Some are better for LL (CUDALucas, FP64), some are better for TF (mfaktX, SP, etc.).

What time per iteration do you witness? Does it match the tables for your GPU?

Can you try to run mfaktX (which also won't use much DP) and see if the clocks "stay"? (The heat, however, will increase.)

If so, then there is nothing wrong, your GPU works as expected.

I am aware of the difference between SP and DP; it's a GTX 1070.
The thing is that when the card runs at the higher clock, the time per iteration in CUDALucas changes.
See the attached screenshot for a quick preview of the CUDALucas output.
There, the point where the ms/iter drops from 3.3 to 2.7 is when I launched Chrome to check this thread.
Before this issue first appeared (no more than a week ago), this GPU never failed to run at full speed.
I used to run mfaktc on it, at full speed and clocks (and with a whole lot more heat).

This is why I am assuming this is not a ratio issue. The comparison between "now" and "before" concerns CUDALucas performance only; I used the game example only to show that it's not the whole card that is underclocking/undervolting.
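For a sense of scale, the figures above can be turned into a quick back-of-the-envelope calculation. This is only a sketch: the exponent and core clocks are the ones quoted in the first post, the ms/iter values are the two read off the screenshot, and pairing those timings with exactly 1582 MHz and 2012 MHz is an assumption. An LL test of M(p) takes p - 2 squarings.

```python
EXPONENT = 45582223        # example exponent from the first post
ITERATIONS = EXPONENT - 2  # an LL test of M(p) needs p - 2 iterations


def hours_for_test(ms_per_iter, iterations=ITERATIONS):
    """Wall-clock hours for a full test at a constant ms/iter."""
    return iterations * ms_per_iter / 1000.0 / 3600.0


def ratio(a, b):
    return a / b


if __name__ == "__main__":
    slow_ms, fast_ms = 3.3, 2.7       # ms/iter without and with Chrome open
    slow_mhz, fast_mhz = 1582, 2012   # core clocks quoted in the first post
    print("hours at %.1f ms/iter: %.2f" % (slow_ms, hours_for_test(slow_ms)))
    print("hours at %.1f ms/iter: %.2f" % (fast_ms, hours_for_test(fast_ms)))
    print("iteration speedup    : %.3f" % ratio(slow_ms, fast_ms))
    print("core clock ratio     : %.3f" % ratio(fast_mhz, slow_mhz))
```

The speedup in iteration time is a bit smaller than the core clock ratio, which would be expected if part of each iteration is limited by memory rather than by the core clock.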
Attached: Capture.PNG (49.4 KB)
Old 2018-02-02, 01:45   #4
Cubox
 
Dec 2017

19₁₆ Posts

I have included here a screenshot proving that my overclock is indeed stable.
Only the manual testing lines are relevant.
Attached: Capture.PNG (225.2 KB)
Old 2018-02-02, 02:53   #5
Cubox
 
Dec 2017

5² Posts

Here are screenshots of MSI Afterburner, one with Chrome closed and one with Chrome open.
The second screenshot includes the CUDALucas output and various graphs showing the difference in GPU usage with Chrome open or closed, while CUDALucas runs in the background.

You can see the GPU clock bumping up, along with GPU power and voltage, and the "limit" switching from "no load" to "voltage" (the voltage limit is normal when the card is used at its max potential).
Attached: Capture.PNG (870.2 KB), Capture2-min.PNG (514.9 KB)
Old 2018-02-02, 03:03   #6
wombatman
I moo ablest echo power!
 
May 2013

11100000100₂ Posts

Try turning off hardware acceleration in Chrome and see if the GPU acts like Chrome isn't open.
Old 2018-02-02, 07:14   #7
Cubox
 
Dec 2017

5² Posts

Quote:
Originally Posted by wombatman View Post
Try turning off hardware acceleration in Chrome and see if the GPU acts like Chrome isn't open.

Tried that, and indeed, disabling hardware acceleration keeps my GPU at the "reduced" clock and voltage.
I am guessing that CUDALucas on its own is not making the card push as hard as it can. I'm looking into CUDA workloads being slower when no 3D application is open.

Last fiddled with by Cubox on 2018-02-02 at 07:40
Old 2018-02-02, 08:05   #8
Cubox
 
Dec 2017

5² Posts

I installed the nvidia-smi utility in order to get all kinds of data about the GPU.
I saw no change in the following fields between "full speed/Chrome open" and "reduced clock":

Code:
    Performance State               : P2
    Clocks Throttle Reasons
        Idle                        : Not Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
            HW Thermal Slowdown     : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost                  : Not Active
        SW Thermal Slowdown         : Not Active
However, at the very end, we can see:
Code:
    Processes
        Process ID                  : 1728
            Type                    : C+G
            Name                    : C:\Windows\System32\dwm.exe
            Used GPU Memory         : Not available in WDDM driver model
        Process ID                  : 2280
            Type                    : C
            Name                    : O:\Softs\Primes\CUDALucas - 1070\CUDALucas2.06beta-CUDA8.0-Windows-x64.exe
            Used GPU Memory         : Not available in WDDM driver model
        Process ID                  : 7780
            Type                    : C+G
            Name                    : C:\Users\cubox\AppData\Roaming\Spotify\Spotify.exe
            Used GPU Memory         : Not available in WDDM driver model
        Process ID                  : 9868
            Type                    : C+G
            Name                    : C:\Program Files (x86)\Google\Chrome\Application\chrome.exe
            Used GPU Memory         : Not available in WDDM driver model
CUDALucas is only a "Compute" (C) process, in contrast to the other processes, which are all "Compute+Graphics" (C+G).
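If you want to pick out the compute-only processes without reading the list by eye, something like the following Python sketch works on the text shown above. The sample is a shortened copy of my own output, and the function names are just ones I made up; it assumes the `key : value` layout that `nvidia-smi -q` prints here.

```python
SAMPLE = r"""
    Processes
        Process ID                  : 1728
            Type                    : C+G
            Name                    : C:\Windows\System32\dwm.exe
        Process ID                  : 2280
            Type                    : C
            Name                    : O:\Softs\Primes\CUDALucas - 1070\CUDALucas2.06beta-CUDA8.0-Windows-x64.exe
        Process ID                  : 9868
            Type                    : C+G
            Name                    : C:\Program Files (x86)\Google\Chrome\Application\chrome.exe
"""


def parse_processes(text):
    """Collect pid/type/name for each entry of a 'Processes' section."""
    procs = []
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            continue  # section headers and blank lines carry no value
        key, value = key.strip(), value.strip()
        if key == "Process ID":
            procs.append({"pid": int(value)})
        elif procs and key in ("Type", "Name"):
            procs[-1][key.lower()] = value
    return procs


def compute_only(procs):
    """Processes that nvidia-smi lists as type C (compute, no graphics)."""
    return [p for p in procs if p.get("type") == "C"]


if __name__ == "__main__":
    for proc in compute_only(parse_processes(SAMPLE)):
        print(proc["pid"], proc["name"])
```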

I guess that if I could make CUDALucas a C+G process, this problem wouldn't occur anymore.
I'm curious what other CUDALucas users are seeing. If you want to check for yourself, download and install https://developer.nvidia.com/cuda-do...ype=exenetwork

Then open cmd or PowerShell, go to C:\Program Files\NVIDIA Corporation\NVSMI and run .\nvidia-smi -q
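If you only care about the clocks, performance state and power draw, nvidia-smi can also print just those fields as CSV through its `--query-gpu` option. Below is a minimal Python sketch that wraps it; the field list is my own choice, the helper names are made up, and it assumes `nvidia-smi` is on the PATH (otherwise it just says so).

```python
import shutil
import subprocess

# Fields accepted by `nvidia-smi --query-gpu`; the selection is arbitrary.
FIELDS = ["clocks.sm", "clocks.mem", "pstate", "power.draw"]


def parse_csv_line(line):
    """Turn one 'a, b, c, d' CSV line into a {field: value} dict."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(FIELDS, values))


def query_gpus():
    """Run nvidia-smi once and return one dict per GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return [parse_csv_line(line) for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
    else:
        try:
            for gpu in query_gpus():
                print(gpu)
        except subprocess.CalledProcessError as err:
            print("nvidia-smi failed:", err)
```

Running it once with Chrome closed and once with Chrome open gives two lines that are easy to compare side by side.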

My full output:

Code:
==============NVSMI LOG==============

Timestamp                           : Fri Feb 02 08:54:28 2018
Driver Version                      : 388.31

Attached GPUs                       : 1
GPU 00000000:01:00.0
    Product Name                    : GeForce GTX 1070
    Product Brand                   : GeForce
    Display Mode                    : Enabled
    Display Active                  : Enabled
    Persistence Mode                : N/A
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 1920
    Driver Model
        Current                     : WDDM
        Pending                     : WDDM
    Serial Number                   : N/A
    GPU UUID                        : GPU-(Removed by author)
    Minor Number                    : N/A
    VBIOS Version                   : 86.04.26.00.3E
    MultiGPU Board                  : No
    Board ID                        : 0x100
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : G001.0000.01.03
        OEM Object                  : 1.1
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization mode         : None
    PCI
        Bus                         : 0x01
        Device                      : 0x00
        Domain                      : 0x0000
        Device Id                   : 0x1B8110DE
        Bus Id                      : 00000000:01:00.0
        Sub System Id               : 0x33021462
        GPU Link Info
            PCIe Generation
                Max                 : 3
                Current             : 3
            Link Width
                Max                 : 16x
                Current             : 8x
        Bridge Chip
            Type                    : N/A
            Firmware                : N/A
        Replays since reset         : 0
        Tx Throughput               : 86000 KB/s
        Rx Throughput               : 106000 KB/s
    Fan Speed                       : 41 %
    Performance State               : P2
    Clocks Throttle Reasons
        Idle                        : Not Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
            HW Thermal Slowdown     : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost                  : Not Active
        SW Thermal Slowdown         : Not Active
    FB Memory Usage
        Total                       : 8192 MiB
        Used                        : 737 MiB
        Free                        : 7455 MiB
    BAR1 Memory Usage
        Total                       : 256 MiB
        Used                        : 229 MiB
        Free                        : 27 MiB
    Compute Mode                    : Default
    Utilization
        Gpu                         : 100 %
        Memory                      : 63 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    Encoder Stats
        Active Sessions             : 0
        Average FPS                 : 0
        Average Latency             : 0
    Ecc Mode
        Current                     : N/A
        Pending                     : N/A
    ECC Errors
        Volatile
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
        Aggregate
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
    Retired Pages
        Single Bit ECC              : N/A
        Double Bit ECC              : N/A
        Pending                     : N/A
    Temperature
        GPU Current Temp            : 66 C
        GPU Shutdown Temp           : 99 C
        GPU Slowdown Temp           : 96 C
        GPU Max Operating Temp      : N/A
        Memory Current Temp         : N/A
        Memory Max Operating Temp   : N/A
    Power Readings
        Power Management            : Supported
        Power Draw                  : 121.06 W
        Power Limit                 : 230.00 W
        Default Power Limit         : 230.00 W
        Enforced Power Limit        : 230.00 W
        Min Power Limit             : 115.00 W
        Max Power Limit             : 291.00 W
    Clocks
        Graphics                    : 1911 MHz
        SM                          : 1911 MHz
        Memory                      : 3802 MHz
        Video                       : 1708 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 1987 MHz
        SM                          : 1987 MHz
        Memory                      : 4004 MHz
        Video                       : 1708 MHz
    Max Customer Boost Clocks
        Graphics                    : N/A
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A
    Processes
        Process ID                  : 1728
            Type                    : C+G
            Name                    : C:\Windows\System32\dwm.exe
            Used GPU Memory         : Not available in WDDM driver model
        Process ID                  : 2280
            Type                    : C
            Name                    : O:\Softs\Primes\CUDALucas - 1070\CUDALucas2.06beta-CUDA8.0-Windows-x64.exe
            Used GPU Memory         : Not available in WDDM driver model
        Process ID                  : 7780
            Type                    : C+G
            Name                    : C:\Users\cubox\AppData\Roaming\Spotify\Spotify.exe
            Used GPU Memory         : Not available in WDDM driver model
        Process ID                  : 9868
            Type                    : C+G
            Name                    : C:\Program Files (x86)\Google\Chrome\Application\chrome.exe
            Used GPU Memory         : Not available in WDDM driver model
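Since the full log is long, here is a small Python sketch that flattens the indented `key : value` layout into "Section/Key" paths and compares the current clocks with the maximum clocks. The sample is cut from the log above; the function names are made up, and on the complete log every path would additionally be prefixed by the `GPU 00000000:01:00.0` header line, so the lookups would need adjusting.

```python
SAMPLE = """\
    Clocks
        Graphics                    : 1911 MHz
        SM                          : 1911 MHz
        Memory                      : 3802 MHz
        Video                       : 1708 MHz
    Max Clocks
        Graphics                    : 1987 MHz
        SM                          : 1987 MHz
        Memory                      : 4004 MHz
        Video                       : 1708 MHz
"""


def parse_nvsmi(text):
    """Flatten indented `key : value` lines into {"Section/Key": value}."""
    fields = {}
    stack = []  # (indent, section name) for every enclosing section
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("="):
            continue  # skip blank lines and the "====NVSMI LOG====" banner
        indent = len(raw) - len(raw.lstrip())
        while stack and stack[-1][0] >= indent:
            stack.pop()  # leave sections that are not parents of this line
        key, sep, value = line.partition(" : ")
        if sep:
            path = [name for _, name in stack] + [key.strip()]
            fields["/".join(path)] = value.strip()
        else:
            stack.append((indent, line))  # a line without a value opens a section
    return fields


def mhz(value):
    """'1911 MHz' -> 1911"""
    return int(value.split()[0])


def percent_of_max(fields):
    """Current clock as a percentage of the maximum clock, per clock domain."""
    result = {}
    for domain in ("Graphics", "SM", "Memory", "Video"):
        current = mhz(fields["Clocks/" + domain])
        maximum = mhz(fields["Max Clocks/" + domain])
        result[domain] = round(100.0 * current / maximum, 1)
    return result


if __name__ == "__main__":
    print(percent_of_max(parse_nvsmi(SAMPLE)))
```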
Old 2018-02-02, 14:08   #9
wombatman
I moo ablest echo power!
 
May 2013

2²·449 Posts

I don't know what brand of GPU you have, but with my EVGA card I can use their PrecisionX utility to enable what they call "KBoost", which basically forces the GPU to run at maximum clock at all times. They note that it can be particularly helpful in benchmarking, which suggests it should help you with CUDALucas.
Old 2018-02-03, 03:33   #10
Cubox
 
Dec 2017

5² Posts

After removing my overclock, I realised that the clock speed the GPU is stuck at is the same as that of a stock, non-overclocked, non-boosting MSI GTX 1070 GAMING X (my card).

The issue is definitely that turbo boosting is not occurring.
Old 2018-02-04, 15:09   #11
Cubox
 
Dec 2017

5² Posts

Same issue with mfaktc...

All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
