mersenneforum.org

Faster GPU-ECM with CGBN (https://www.mersenneforum.org/showthread.php?t=27103)

SethTro 2021-09-01 22:46

[QUOTE=frmky;587024]cudacommon.h is missing from the git repository.[/QUOTE]

Fixed along with another issue.

SethTro 2021-09-02 00:44

[QUOTE=bsquared;587026]1280: (~31 ms/curves)
2560: (~21 ms/curves)
640: (~63 ms/curves)
1792: (~36 ms/curves)

So we have a winner! -gpucurves 2560 beats all the others and anything the old build could do as well (best on the old build was 5120 @ (~25 ms/curves))

With the smaller kernel (running (2^499-1) / 20959), -gpucurves 5120 is fastest at about 6ms/curve on both new and old builds.[/QUOTE]

I added `gpu_throughput_test.sh` which runs different sized inputs and measures throughput.

On my system, maximum throughput is achieved at:

256 bits: 2x the default curves (3584 curves); 4x the default is equally fast
512 bits: 2x and 4x the default curves
1024 bits: only the default curves
2048 bits (extra testing): 1.5x and 3x slightly outperform 2x and 4x
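
As a rough illustration of how such measurements can be compared, the sketch below converts per-curve timings into curves per second and picks the fastest setting. The numbers are bsquared's timings quoted above; the function names are hypothetical and are not taken from gpu_throughput_test.sh.

```python
# Per-curve timings quoted above, keyed by the -gpucurves value (ms per curve).
MS_PER_CURVE = {640: 63.0, 1280: 31.0, 1792: 36.0, 2560: 21.0}


def curves_per_second(ms_per_curve):
    """Convert a per-curve time in milliseconds to a throughput."""
    return 1000.0 / ms_per_curve


def best_gpucurves(timings):
    """Return the -gpucurves value with the lowest time per curve."""
    return min(timings, key=timings.get)


if __name__ == "__main__":
    for curves in sorted(MS_PER_CURVE):
        rate = curves_per_second(MS_PER_CURVE[curves])
        print(f"-gpucurves {curves}: {rate:.1f} curves/s")
    print("fastest:", best_gpucurves(MS_PER_CURVE))
```

The same comparison would need to be repeated per input size, since the best setting reported here differs between 256, 512, 1024 and 2048 bits.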

SethTro 2021-09-02 00:45

[QUOTE=SethTro;587033]I added `gpu_throughput_test.sh` which runs different sized inputs and measures throughput.

On my system, maximum throughput is achieved at:

256 bits: 2x the default curves (3584 curves); 4x the default is equally fast
512 bits: 2x and 4x the default curves
1024 bits: only the default curves
2048 bits (extra testing): 1.5x and 3x slightly outperform 2x and 4x[/QUOTE]

Maybe this relates to the number of registers used by the kernel, or to the maximum threads per block? Any insight from CUDA experts would be appreciated.

SethTro 2021-09-02 09:12

I halved compile time by adding cgbn_swap and avoiding inlining double_add_v2 twice.

Sadly, I already pushed the branch, so it will probably fail to compile for everyone until [url]https://github.com/NVlabs/CGBN/pull/17[/url] is merged.

---

@bsquared, you might try changing TPB_DEFAULT from 128 to 512. In some initial testing with ./gpu_throughput_test.sh, it looks like larger gpucurves values no longer slow things down. More testing to follow tomorrow.

chris2be8 2021-09-02 15:39

[QUOTE=henryzz;587025]My guess is that your gcc version may be too old. I would try the most recent version you can get your hands on. The easiest way may be to update your OS into a version that isn't end of life.[/QUOTE]

I've installed gcc-6 (the latest in the repositories) and that gets past that error, but fails a bit further on:
[code]
gcc-6 --version
gcc-6 (SUSE Linux) 6.2.1 20160826 [gcc-6-branch revision 239773]
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

./configure --enable-gpu=30 --with-cuda=/usr/local/cuda CC=gcc-6 -with-cgbn-include=/home/chris/CGBN/include/cgbn
...
configure: Using cuda.h from /usr/local/cuda/include
checking cuda.h usability... yes
checking cuda.h presence... yes
checking for cuda.h... yes
checking that CUDA Toolkit version is at least 3.0... (9.0) yes
configure: Using CUDA dynamic library from /usr/local/cuda/lib64
checking for cuInit in -lcuda... yes
checking that CUDA Toolkit version and runtime version are the same... no
configure: error: 'cuda.h' and 'cudart' library have different versions, you have to reinstall CUDA properly, or use the --with-cuda parameter to tell configure the path to the CUDA library and header you want to use
[/code]

That error message doesn't make much sense because I only have one version of CUDA installed on the system. So it's probably failing to compile a test program.

So I'll try upgrading the OS next. Then install later versions of CUDA and gcc.

SethTro 2021-09-02 20:06

[QUOTE=chris2be8;587067]I've installed gcc-6 (the latest in the repositories) and that gets past that error, but fails a bit further on:
[code]
configure: error: 'cuda.h' and 'cudart' library have different versions, you have to reinstall CUDA properly, or use the --with-cuda parameter to tell configure the path to the CUDA library and header you want to use
[/code]

That error message doesn't make much sense because I only have one version of CUDA installed on the system. So it's probably failing to compile a test program.
[/QUOTE]

You can find the literal program it failed to compile in config.log, or its general shape in acinclude.m4 (basically, wrap the second block in int main() { ... }).

[CODE]
AC_RUN_IFELSE([AC_LANG_PROGRAM([
[
#include <stdio.h>
#include <string.h>
#include <cuda.h>
#include <cuda_runtime.h>
]],[[
  int libversion;
  cudaError_t err;
  err = cudaRuntimeGetVersion (&libversion);
  if (err != cudaSuccess)
    {
      printf ("Could not get runtime version\n");
      printf ("Error msg: %s\n", cudaGetErrorString(err));
      return -1;
    }
  printf("(%d.%d/", CUDA_VERSION/1000, (CUDA_VERSION/10) % 10);
  printf("%d.%d) ", libversion/1000, (libversion/10) % 10);
  if (CUDA_VERSION == libversion)
    return 0;
  else
    return 1;
]])],
[/CODE]

You can also find the command line it used to compile this in config.log (my guess is something like gcc-9 -o conftest -I/usr/local/cuda/include -g -O2 -I/usr/local/cuda/include -Wl,-rpath,/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 conftest.c -lcudart -lstdc++ -lcuda -lrt ).
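
For reference, CUDA_VERSION and the value returned by cudaRuntimeGetVersion both encode a version as 1000*major + 10*minor, which is what the two printf lines in the test decode. The Python sketch below mirrors that arithmetic and the test's exit status; the function names are made up for illustration. Note that AC_RUN_IFELSE takes its failure branch whenever the test program fails to compile, fails to link, or exits non-zero, so a build failure would be reported the same way as a genuine version mismatch.

```python
def decode_cuda_version(value):
    """Split an encoded CUDA version (1000*major + 10*minor) into (major, minor)."""
    return value // 1000, (value // 10) % 10


def conftest_exit_status(header_version, runtime_version):
    """Mirror the test program: 0 if both versions are identical, otherwise 1.

    The real program also returns -1 when cudaRuntimeGetVersion itself fails;
    that case is not modelled here.
    """
    return 0 if header_version == runtime_version else 1


if __name__ == "__main__":
    # 9000 is how CUDA 9.0 is encoded, 11040 is CUDA 11.4.
    for value in (9000, 11040):
        major, minor = decode_cuda_version(value)
        print(f"{value} -> {major}.{minor}")
    print(conftest_exit_status(9000, 9000))
```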

frmky 2021-09-02 22:28

I think this can be triggered if the version of CUDA supported by the driver doesn't match the toolkit version. But this is usually ok as long as the driver is a little newer. I think both this and the lack of cuInit() in the CUDA lib should be warnings, not errors. Both of these are ok in some circumstances.
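
One way to express that suggestion is to classify the comparison instead of failing outright. The sketch below is a hypothetical policy, not what configure currently does: identical versions pass, and a newer runtime only produces a warning. Treating an older runtime as a hard error is an assumption of this sketch; the post above argues for a warning in general. The function name and the three labels are invented for illustration.

```python
def classify_versions(header_version, runtime_version):
    """Compare two encoded CUDA versions (1000*major + 10*minor)."""
    if runtime_version == header_version:
        return "ok"
    if runtime_version > header_version:
        # Newer runtime than the headers: usually fine, so only warn.
        return "warning"
    # Older runtime than the headers: assumed here to be a hard error.
    return "error"


if __name__ == "__main__":
    print(classify_versions(9000, 9000))
    print(classify_versions(9000, 11040))
    print(classify_versions(11040, 9000))
```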

SethTro 2021-09-02 23:57

Happy me!

I found a 36-digit and a 35-digit factor of a [URL="http://factordb.com/index.php?id=1100000002657449020"]C303[/URL] today (from [URL="https://docs.google.com/spreadsheets/d/1IuxGlf6dEUd8Qixu87P-_r6sgdG7Yl8UUPXS6rKBpbM/edit#gid=1905095108"]Factoring for a publication[/URL])

[CODE]
GPU: factor 404157820975138535541421971085010741 found in Step 1 with curve 1796 (-sigma 3:1850760857)
GPU: factor 404157820975138535541421971085010741 found in Step 1 with curve 2049 (-sigma 3:1850761110)
GPU: factor 404157820975138535541421971085010741 found in Step 1 with curve 2449 (-sigma 3:1850761510)
Computing 3584 Step 1 took 2294ms of CPU time / 1816867ms of GPU time
********** Factor found in step 1: 404157820975138535541421971085010741
Found prime factor of 36 digits: 404157820975138535541421971085010741
[/CODE]

Then
[CODE]
Thu 2021/09/02 23:25:50 UTC Step 1 took 0ms
Thu 2021/09/02 23:25:50 UTC Step 2 took 9668ms
Thu 2021/09/02 23:25:50 UTC ********** Factor found in step 2: 51858345311243630596653971633910169
Thu 2021/09/02 23:25:50 UTC Found prime factor of 35 digits: 51858345311243630596653971633910169
[/CODE]

Feels good that this code is being useful :)

frmky 2021-09-03 07:02

[QUOTE=SethTro;587108]Feels good that this code is being useful :)[/QUOTE]
Nearly all of the factors that I found for Factoring for a Publication 2 used this code.

chris2be8 2021-09-04 16:00

I'm still puzzling over it. I've upgraded the system to openSUSE Leap 15.3 and installed CUDA 11.4. But no matter what I do [c]lspci -v[/c] still says [c]Kernel modules: nouveau[/c]

I've tried everything I can find in the CUDA Installation Guide for Linux. And everything I can find on the web. But it still loads the nouveau kernel module, not the one shipped with CUDA. Has anyone any idea how to get it to use the Nvidia drivers?

NB. On the system with the GTX 970:
[code]
4core:~ # lspci -v -s 01:00
01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1) (prog-if 00 [VGA controller])
Subsystem: eVga.com. Corp. Device 3978
Flags: fast devsel, IRQ 11
Memory at f6000000 (32-bit, non-prefetchable) [disabled] [size=16M]
Memory at e0000000 (64-bit, prefetchable) [disabled] [size=256M]
Memory at f0000000 (64-bit, prefetchable) [disabled] [size=32M]
I/O ports at e000 [disabled] [size=128]
Expansion ROM at f7000000 [disabled] [size=512K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Legacy Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [258] L1 PM Substates
Capabilities: [128] Power Budgeting <?>
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] #19
Kernel modules: nouveau
[/code]

Compare with on the system with a CC 3.0 card:
[code]
root@sirius:~# lspci -v -s 07:00
07:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 760] (rev a1) (prog-if 00 [VGA controller])
Subsystem: Micro-Star International Co., Ltd. [MSI] GK104 [GeForce GTX 760]
Flags: bus master, fast devsel, latency 0, IRQ 76
Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
Memory at e8000000 (64-bit, prefetchable) [size=128M]
Memory at f0000000 (64-bit, prefetchable) [size=32M]
I/O ports at e000 [size=128]
[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Capabilities: [b4] Vendor Specific Information: Len=14 <?>
Capabilities: [100] Virtual Channel
Capabilities: [128] Power Budgeting <?>
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] #19
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
[/code]

Compare the last line of output in each case.
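
The comparison can also be scripted. The sketch below is a minimal Python parser for the two relevant lines of lspci -v output; the function name is hypothetical, and the sample strings are abbreviated from the listings above. A missing "Kernel driver in use" line means no driver is bound to the device.

```python
def kernel_driver_info(lspci_text):
    """Return (driver_in_use, available_modules) parsed from `lspci -v` output."""
    driver = None
    modules = []
    for line in lspci_text.splitlines():
        # Split on the first colon only; values may contain further colons.
        key, _, value = line.strip().partition(":")
        if key == "Kernel driver in use":
            driver = value.strip()
        elif key == "Kernel modules":
            modules = [name.strip() for name in value.split(",")]
    return driver, modules


GTX_970 = """\
01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1)
Kernel modules: nouveau
"""

GTX_760 = """\
07:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 760] (rev a1)
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
"""

if __name__ == "__main__":
    print(kernel_driver_info(GTX_970))
    print(kernel_driver_info(GTX_760))
```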

If it's because CUDA 11.4 doesn't support this card, I could try removing CUDA 11.4 and installing CUDA 10.x. But would that work?

paulunderwood 2021-09-04 16:58

Try following [URL="https://askubuntu.com/questions/1095825/what-does-modprobe-blacklist-nouveau-do"]this solution[/URL] (written for Ubuntu, though):

[QUOTE]
Boot to Ubuntu, but before you log in to Ubuntu, press Ctrl+Alt+F2

run the following command:

sudo nano /etc/modprobe.d/blacklist-nouveau.conf

add the 2 following lines, save & exit

blacklist nouveau
options nouveau modeset=0

run the following command

sudo update-initramfs -u

[/QUOTE]

Then reboot and run [c]lsmod | grep nvidia[/c] to check that the nvidia module is loaded.

HTH


All times are UTC.