#3158
Bemusing Prompter
"Danny"
Dec 2002
California
3^4·29 Posts
I'm getting the following errors while trying to compile mfaktc on a Linux system:
Code:
nvcc fatal : Unsupported gpu architecture 'compute_11'
Makefile:55: recipe for target 'tf_72bit.o' failed
#3159
Sep 2003
3^2×7×41 Posts
#3160
"Sam Laur"
Dec 2018
Turku, Finland
2·3·5·11 Posts
You need to modify the Makefile so that the compute_xx and sm_xx values match the hardware you're going to run mfaktc on. You're getting this error because newer versions of CUDA no longer support compute capability 1.1 (CUDA SDK 6.5 was the last one that did). That's an old architecture anyway: 1.1-1.3 covers cards that are over 10 years old by now, the GTX 2xx series and the like. Conversely, the newest cards (RTX 20 / GTX 16) need compute capability 7.5, which isn't supported by anything older than CUDA SDK 10.0. So find out what your card needs, put it in the Makefile, check that your CUDA version supports it, and recompile. That should do it.
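The check described above can be sketched in Python. This is only an illustration, not part of mfaktc: the function names are hypothetical, and the only version limits encoded are the two mentioned in this post (compute capability 1.x needs CUDA 6.5 or older, 7.5 needs CUDA 10.0 or newer). Everything else is assumed to be fine, which may not hold for your toolkit.

```python
def gencode_flag(major, minor):
    """Return the NVCCFLAGS line for one compute capability, e.g. 5.2 -> compute_52 / sm_52."""
    cc = f"{major}{minor}"
    return f"NVCCFLAGS += --generate-code arch=compute_{cc},code=sm_{cc}"


def toolkit_problem(cc, cuda_version):
    """Return a warning string if the CUDA toolkit cannot target cc, else None.

    cc and cuda_version are (major, minor) tuples. Only the two limits
    stated in the post are checked (an assumption of this sketch).
    """
    if cc[0] == 1 and cuda_version > (6, 5):
        return "compute capability 1.x needs CUDA 6.5 or older"
    if cc == (7, 5) and cuda_version < (10, 0):
        return "compute capability 7.5 needs CUDA 10.0 or newer"
    return None


if __name__ == "__main__":
    # Flag line for a compute capability 5.2 card
    print(gencode_flag(5, 2))
    # The failing case from this thread: compute_11 with a recent toolkit
    print(toolkit_problem((1, 1), (10, 1)))
```

The printed NVCCFLAGS line is what would go into the Makefile in place of the unsupported compute_11 entry.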
#3161
Bemusing Prompter
"Danny"
Dec 2002
California
92D_16 Posts
Thanks. Commenting out the appropriate lines in the makefile resolved the issue.
#3162
Bemusing Prompter
"Danny"
Dec 2002
California
3^4·29 Posts
I recently got access to two Tesla V100 GPUs. However, running mfaktc gives me the following error:
Quote:
Code:
NVCCFLAGS += --generate-code arch=compute_70,code=sm_70
#3163
"Oliver"
Mar 2005
Germany
2·3·5·37 Posts
Quote:
Oliver
#3164
Bemusing Prompter
"Danny"
Dec 2002
California
3^4·29 Posts
It might be a good idea for the makefile to have rules for different CUDA versions. For example, make cuda10 would automatically apply the flags for compute capabilities 3.0 to 7.5.
Ideally, non-developers shouldn't have to know how to modify makefiles. At the very least, the documentation should probably be updated.
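As a rough sketch of what such a rule would have to generate, the Python snippet below builds one NVCCFLAGS line per compute capability in a range. The capability list and the function name are illustrative assumptions, not taken from mfaktc's Makefile, so trim the list to the architectures you actually want to build for.

```python
# Illustrative list of compute capabilities between 3.0 and 7.5
# (an assumption of this sketch, not copied from mfaktc's Makefile).
CAPABILITIES = [(3, 0), (3, 5), (5, 0), (5, 2), (6, 0), (6, 1), (7, 0), (7, 5)]


def flags_for_range(lowest, highest):
    """Return NVCCFLAGS lines for every listed capability in [lowest, highest]."""
    lines = []
    for major, minor in CAPABILITIES:
        if lowest <= (major, minor) <= highest:
            cc = f"{major}{minor}"
            lines.append(
                f"NVCCFLAGS += --generate-code arch=compute_{cc},code=sm_{cc}"
            )
    return lines


if __name__ == "__main__":
    # What a hypothetical "make cuda10" rule might apply: 3.0 through 7.5.
    print("\n".join(flags_for_range((3, 0), (7, 5))))
```

A real makefile rule would do the same thing with make variables; the point here is only that the flag list follows mechanically from the version range.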
#3165
Apr 2019
5·41 Posts
Hi, I've just built mfaktc-0.21 against CUDA 10.1 on Linux, with a GTX 970, and I'm getting errors like this:
Code:
mfaktc v0.21 (64bit built)
Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled
Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                1
  CPUStreams                1
  GridSize                  0
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              4Mi bits
  GPUSieveProcessSize       8Ki bits
  Checkpoints               enabled
  CheckpointDelay           30s
  WorkFileAddDelay          600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  (none)
  ComputerID                (none)
  AllowSleep                no
  TimeStampInResults        no
CUDA version info
  binary compiled for CUDA  10.10
  CUDA runtime version      10.10
  CUDA driver version       10.10
CUDA device info
  name                      GeForce GTX 970
  compute capability        5.2
  max threads per block     1024
  max shared memory per MP  98304 byte
  number of multiprocessors 13
  CUDA cores per MP         128
  CUDA cores - total        1664
  clock rate (CUDA cores)   1240MHz
  memory clock rate:        3505MHz
  memory bus width:         256 bit
Automatic parameters
  threads per grid          106496
ERROR: cudaStreamCreate() failed for stream 0
cudaGetLastError() returned 2: out of memory

About one in every 5-10 attempts it actually gets past this error, and then everything seems to run fine (I completed -st and -st2 with all tests passing). But most of the time I just get that "out of memory" error before it even gets started.
Any ideas what is wrong? Is it likely due to some improper configuration on my part when building?
Using nvidia-smi to report utilization (without mfaktc running), I see memory usage of: 152MiB / 4039MiB
Is it possible it is complaining about regular system memory, not GPU memory? I have 16GB of actual RAM, which is currently all used up by another process, but there is still gobs of swap space free on this system (>300GB) from an NVMe SSD.
#3166
Apr 2019
5·41 Posts
Never mind the above. It seems the issue was fixed by rebooting. I had been waiting for a very long job to finish, and I think I must have forgotten to reboot after installing CUDA, which caused it to behave in all kinds of strange ways.
#3167
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
4843_10 Posts
While playing with an old, slow GPU from my spare parts bin in an equally old Core 2 Duo system, I found that, after it passes ~2900 self tests, TDR timeouts reliably produce the known false factor 38814612911305349835664385407 in mfaktc:
Code:
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:01:57.14
mfaktc v0.21 (64bit built)
Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled
Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  Checkpoints               enabled
  CheckpointDelay           900s
  WorkFileAddDelay          600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  kriesel
  ComputerID                eaglet-nvs295
  AllowSleep                no
  TimeStampInResults        yes
CUDA version info
  binary compiled for CUDA  6.50
  CUDA runtime version      6.50
  CUDA driver version       6.50
CUDA device info
  name                      Quadro NVS 295
  compute capability        1.1
  max threads per block     512
  max shared memory per MP  16384 byte
  number of multiprocessors 1
  CUDA cores per MP         8
  CUDA cores - total        8
  clock rate (CUDA cores)   1300MHz
  memory clock rate:        695MHz
  memory bus width:         64 bit
Automatic parameters
  threads per grid          1048576
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144
running a simple selftest...
Selftest statistics
  number of tests           107
  successfull tests         107
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:02 |    0   0.1% |  3.928   1h02m |    182.63    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:02:39.49mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:02:39.53
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:03 |    0   0.1% |  3.573  57m07s |    200.78    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:03:21.71mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:03:21.72
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:04 |    0   0.1% |  3.604  57m36s |    199.05    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:04:03.89mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:08:00.62
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:08 |    0   0.1% |  3.629  58m00s |    197.68    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:08:42.44mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:08:42.60
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:09 |    0   0.1% |  3.606  57m38s |    198.94    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:09:25.30mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:09:25.36
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:10 |    0   0.1% |  3.628  57m59s |    197.74    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:10:07.93mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:30:12.32
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:30 |    0   0.1% |  3.557  56m51s |    201.68    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:30:54.43mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:30:54.49
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:31 |    0   0.1% |  3.585  57m18s |    200.11    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:31:36.79mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:31:37.01
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:32 |    0   0.1% |  4.533   1h12m |    158.26    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:32:20.23mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:44:07.67
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:44 |    0   0.1% |  3.522  56m18s |    203.69    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:44:49.76mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:44:49.82
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:45 |    0   0.1% |  3.556  56m50s |    201.74    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:45:32.10mfaktc quadro nvs295 exit logged by batch wrapper
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:45:32.15
mfaktc v0.21 (64bit built)
...
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:46 |    0   0.1% |  3.548  56m43s |    202.20    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:46:14.24mfaktc quadro nvs295 exit logged by batch wrapper

https://docs.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys yields success of a sort:
Code:
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:57:36.34
mfaktc v0.21 (64bit built)
Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled
Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  Checkpoints               enabled
  CheckpointDelay           900s
  WorkFileAddDelay          3600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  kriesel
  ComputerID                eaglet-nvs295
  AllowSleep                no
  TimeStampInResults        yes
CUDA version info
  binary compiled for CUDA  6.50
  CUDA runtime version      6.50
  CUDA driver version       6.50
CUDA device info
  name                      Quadro NVS 295
  compute capability        1.1
  max threads per block     512
  max shared memory per MP  16384 byte
  number of multiprocessors 1
  CUDA cores per MP         8
  CUDA cores - total        8
  clock rate (CUDA cores)   1300MHz
  memory clock rate:        695MHz
  memory bus width:         64 bit
Automatic parameters
  threads per grid          1048576
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144
running a simple selftest...
Selftest statistics
  number of tests           107
  successfull tests         107
selftest PASSED!
got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min = 19676691147960
 k_max = 39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 21:05 |    0   0.1% | 411.97   4d13h |      1.74    82485    n.a.%
Jul 02 21:11 |    5   0.2% | 411.85   4d13h |      1.74    82485    n.a.%
Jul 02 21:18 |    9   0.3% | 411.49   4d13h |      1.74    82485    n.a.%
Jul 02 21:25 |   12   0.4% | 411.39   4d13h |      1.74    82485    n.a.%
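A reported factor like the one in these logs can be sanity-checked on the CPU, independently of the GPU. The sketch below is not part of mfaktc and the function name is hypothetical. It relies on the fact that q divides M_p = 2^p - 1 exactly when 2^p ≡ 1 (mod q), and it assumes, as a simplification, that a factor from a 2^72 to 2^73 search must lie strictly inside that range.

```python
def plausible_factor(p, q, bit_min, bit_max):
    """Return True if q lies in [2^bit_min, 2^bit_max) and divides 2^p - 1.

    The strict range check is a simplifying assumption of this sketch.
    """
    in_range = (1 << bit_min) <= q < (1 << bit_max)
    # q divides 2^p - 1 exactly when 2^p mod q == 1
    return in_range and pow(2, p, q) == 1


if __name__ == "__main__":
    # Small genuine case: M11 = 2047 = 23 * 89
    print(plausible_factor(11, 23, 4, 5))  # True
    # The value reported in the logs is far larger than 2^73, so it
    # cannot be a factor from the 2^72..2^73 search.
    print(plausible_factor(119998999, 38814612911305349835664385407, 72, 73))  # False
```

Python's three-argument pow does the modular exponentiation without ever forming 2^p, so the check is instant even for exponents of this size.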
#3168
Apr 2019
5·41 Posts
Are there any particular recommended settings that would help to maximize throughput for TF of many large exponents (between 10^9 and 2^32) to fairly low bit levels? I've tried a few different combinations to reduce overhead, but nothing in particular seems to make a noticeable difference.
I'm using a "less classes" build, by the way. Also, strangely, I seem to get better performance from my laptop's Quadro M1000M than from a GTX 780. I would have expected the GTX to be faster, having more than 4x the CUDA core count of the mobile Quadro. https://www.videocardbenchmark.net/c...780/3349vs2525 The Quadro does have a slightly higher core clock, though; is that all that matters in this case?