mersenneforum.org  

Old 2020-10-23, 09:02   #3400
aheeffer
 
Aug 2020

25 Posts
Default

Quote:
Originally Posted by James Heinrich View Post
Thanks guys, I think (again) that I've found the problem. In this case my code was working as expected and overriding the credit with the correct amount, but the message displayed to the user was pre-written elsewhere using the default credit amount; I just needed to rewrite the user message as well. Can one of the people who've posted above (or anyone who has reported a new TF factor in a fully-factored range within the last 8 hours) confirm whether the GHz-days credit on your Account Result Details page shows the higher (correct) or lower (incorrect) amount of credit for the TF-F (range-fully-factored) result?

processing: TF no-factor for M333930227 (2^78-2^79)
CPU credit is 183.3218 GHz-days.
processing: TF factor 656156919987798312067063 for M333930227 (2^79-2^80) [range fully factored]
CPU credit is 366.6436 GHz-days.


That seems to match now!
Old 2020-10-25, 07:25   #3401
rebirther
 
Sep 2011
Germany

3^2·17^2 Posts
Default

Some of our users are running into an error:

on Windows7 (NVIDIA GeForce GTX 1060 6GB (4095MB) driver: 436.15 OpenCL: 1.2):
Quote:
Compiletime options
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 193154bits
SIEVE_SPLIT 250
MORE_CLASSES enabled

Runtime options
SievePrimes 25000
SievePrimesAdjust 1
SievePrimesMin 5000
SievePrimesMax 100000
NumStreams 3
CPUStreams 3
GridSize 3
GPU Sieving enabled
GPUSievePrimes 82486
GPUSieveSize 64Mi bits
GPUSieveProcessSize 16Ki bits
Checkpoints enabled
CheckpointDelay 300s
WorkFileAddDelay disabled
Stages enabled
StopAfterFactor class
PrintMode full
V5UserID (none)
ComputerID (none)
AllowSleep no
TimeStampInResults no

CUDA version info
binary compiled for CUDA 10.0
CUDA runtime version 10.0
CUDA driver version 10.10

CUDA device info
name GeForce GTX 1060 6GB
compute capability 6.1
max threads per block 1024
max shared memory per MP 98304 byte
number of multiprocessors 10
clock rate (CUDA cores) 1708MHz
memory clock rate: 4004MHz
memory bus width: 192 bit

Automatic parameters
threads per grid 655360
GPUSievePrimes (adjusted) 82486
GPUsieve minimum exponent 1055144

running a simple selftest...
Selftest statistics
number of tests 107
successfull tests 107

selftest PASSED!

got assignment: exp=212828017 bit_min=72 bit_max=73 (4.49 GHz-days)
Starting trial factoring M212828017 from 2^72 to 2^73 (4.49 GHz-days)
k_min = 11094325242000
k_max = 22188650486132
Using GPU kernel "barrett76_mul32_gs"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Oct 25 07:49 | 0 0.1% | 1.531 24m28s | 264.20 82485 n.a.%
Oct 25 07:49 | 3 0.2% | 1.487 23m45s | 272.01 82485 n.a.%
Oct 25 07:49 | 12 0.3% | 1.490 23m46s | 271.47 82485 n.a.%
Oct 25 07:49 | 20 0.4% | 1.493 23m47s | 270.92 82485 n.a.%
Oct 25 07:50 | 23 0.5% | 1.491 23m44s | 271.29 82485 n.a.%
Oct 25 07:50 | 24 0.6% | 1.493 23m44s | 270.92 82485 n.a.%
Oct 25 07:50 | 27 0.7% | 1.492 23m42s | 271.10 82485 n.a.%
Oct 25 07:50 | 35 0.8% | 0.859 13m38s | 470.88 82485 n.a.%
Oct 25 07:50 | 39 0.9% | 0.757 12m00s | 534.33 82485 n.a.%
Oct 25 07:50 | 44 1.0% | 0.754 11m56s | 536.45 82485 n.a.%
Oct 25 07:50 | 47 1.1% | 0.755 11m56s | 535.74 82485 n.a.%
Oct 25 07:50 | 48 1.3% | 0.754 11m55s | 536.45 82485 n.a.%
Oct 25 07:50 | 59 1.4% | 0.754 11m54s | 536.45 82485 n.a.%
Oct 25 07:50 | 60 1.5% | 0.773 12m11s | 523.27 82485 n.a.%
Oct 25 07:50 | 63 1.6% | 0.684 10m46s | 591.35 82485 n.a.%
ERROR: cudaGetLastError() returned 77: an illegal memory access was encountered
on linux mint:
Code:
got assignment: exp=140615327 bit_min=72 bit_max=73 (6.80 GHz-days)
Starting trial factoring M140615327 from 2^72 to 2^73 (6.80 GHz-days)
 k_min =  16791791417940
 k_max =  33583582839939
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Oct 23 20:27 |    0   0.1% |  7.842   2h05m |     78.07    82485    n.a.%
M140615327 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 702: the launch timed out and was terminated
The test runs for a short time and then errors out. Any ideas?

Last fiddled with by rebirther on 2020-10-25 at 07:53
Old 2020-10-25, 16:57   #3402
Viliam Furik
 
Jul 2018
Martin, Slovakia

100000111_2 Posts
Default

The driver seems to be old; I have 456.71 installed. That may be the problem, but it is only a guess.

BTW, I checked the factor, it's not a factor.
Old 2020-10-25, 17:20   #3403
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

2·3·13·41 Posts
Default

Quote:
Originally Posted by rebirther View Post
M140615327 has a factor: 38814612911305349835664385407
Quote:
Originally Posted by Viliam Furik View Post
BTW, I checked the factor, it's not a factor.
Interestingly, it is a factor, but not of M140,615,327. It's actually a composite factor: the product of the two smallest factors of M3,321,928,619.
Old 2020-10-25, 17:42   #3404
Viliam Furik
 
Jul 2018
Martin, Slovakia

263 Posts
Default

Quote:
Originally Posted by James Heinrich View Post
Interestingly, it is a factor, but not of M140,615,327. It's actually a composite factor: the product of the two smallest factors of M3,321,928,619.
Well, I didn't check this before, but now I did - the supposed factor is not even 1 mod 140615327.
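
The quick rejection described here can be sketched in a few lines of Python. This is only an illustration, assuming the standard necessary conditions for a factor f of M(p) with p an odd prime: f is 1 mod 2p, and f is 1 or 7 mod 8. The helper name `could_divide_mersenne` is made up for this sketch and is not part of mfaktc.

```python
# Cheap necessary conditions for f to divide M(p) = 2^p - 1, p an odd prime.
# Passing them does not prove that f is a factor; failing either rules it out.

REPORTED_F = 38814612911305349835664385407  # value from the Linux Mint log


def could_divide_mersenne(p: int, f: int) -> bool:
    """Return False if f provably cannot divide 2^p - 1 (hypothetical helper)."""
    if f % (2 * p) != 1:      # factors have the form 2*k*p + 1
        return False
    return f % 8 in (1, 7)    # and are congruent to +1 or -1 mod 8


if __name__ == "__main__":
    # Exponent that was actually being tested when the error occurred.
    print(could_divide_mersenne(140615327, REPORTED_F))
    # Exponent James mentions for the same value.
    print(could_divide_mersenne(3321928619, REPORTED_F))
```

A value that fails the first test cannot be a factor of that exponent, whatever the GPU reported, so a check like this costs almost nothing to run before trusting a result line.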
Old 2020-10-25, 18:23   #3405
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

4,751 Posts
Default

Quote:
Originally Posted by James Heinrich View Post
Interestingly, it is a factor, but not of M140,615,327. It's actually a composite factor: the product of the two smallest factors of M3,321,928,619.
It's a known factor that is valid for one of the selftest exponents, and that frequently appears when there is an error, regardless of the current exponent being tested.
If a reported factor starts with 388 and ends with 407, check the factor, the logs, the hardware, driver, temperatures, GPU memory reliability, etc.

Since any factor f of a Mersenne number with prime exponent p must be of the form f = 2 k p + 1,
this value could only be a factor of Mersenne numbers whose exponent is one of the prime factors of k p = (f - 1)/2.
Put (38814612911305349835664385407-1)/2 into https://www.alpertron.com.ar/ECM.HTM and it yields 19407 306455 652674 917832 192703 = 3^6 × 31081 × 65381 × 3 943673 × 3321 928619.
Code:
########## testcase 1557/2867 ##########
Starting trial factoring M3321928619 from 2^94 to 2^95 (1207701.03 GHz-days)
Using GPU kernel "95bit_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Dec 04 09:54 |  237   1.0% |  0.048    n.a. |      n.a.    82485    n.a.%
M3321928619 has a factor: 38814612911305349835664385407
found 1 factor for M3321928619 from 2^94 to 2^95 [mfaktc 0.21 95bit_mul32_gs]
selftest for M3321928619 passed!
tf(): total time spent:  0.048s

Starting trial factoring M3321928619 from 2^94 to 2^95 (1207701.03 GHz-days)
Using GPU kernel "95bit_mul32"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Dec 04 09:54 |  237   1.0% |  0.099    n.a. |      n.a.     5000    1.74%
M3321928619 has a factor: 38814612911305349835664385407
found 1 factor for M3321928619 from 2^94 to 2^95 [mfaktc 0.21 95bit_mul32]
selftest for M3321928619 passed!
  tf(): total time spent:  0.101s
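
To tie the factorization and the selftest output together, here is a minimal Python sketch. It takes the prime factorization of (f - 1)/2 quoted above as given (it is not recomputed), confirms that it multiplies back to f, and then checks for which of those primes q the value f really divides 2^q - 1. The names `HALF_FACTORS` and `exponents_divided_by` are illustrative only.

```python
F = 38814612911305349835664385407

# Prime factorization of (F - 1) / 2 as quoted from the ECM applet above.
HALF_FACTORS = {3: 6, 31081: 1, 65381: 1, 3943673: 1, 3321928619: 1}


def product(factors: dict) -> int:
    """Multiply out a {prime: exponent} factorization."""
    result = 1
    for prime, exponent in factors.items():
        result *= prime ** exponent
    return result


def exponents_divided_by(f: int, primes) -> list:
    """Primes q among `primes` for which f divides 2^q - 1."""
    return [q for q in primes if pow(2, q, f) == 1]


if __name__ == "__main__":
    # The quoted factorization must reproduce f = 2 * (k * p) + 1.
    assert 2 * product(HALF_FACTORS) + 1 == F
    hits = exponents_divided_by(F, HALF_FACTORS)
    print("f divides M(q) for q in", hits)
    for q in hits:
        print("q =", q, "k =", (F - 1) // (2 * q))
```

If the selftest result quoted above holds, only 3321928619 should survive, with k = 5842180456452690237. Python's three-argument `pow` does the modular exponentiation directly, so the check is instant even for a 32-bit exponent.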
The following is one example among many from a failing GPU.

Code:
batch wrapper reports mfaktc-win-64.exe (re)launch at Mon 05/28/2018  1:16:53.17 count 3 on model gtx480 dev 0 
mfaktc v0.20 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  WorkFile                  worktodo.txt
  Checkpoints               enabled
  CheckpointDelay           900s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  Kriesel
  ComputerID                dodo-gtx480-0
  ProgressHeader            "Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait"
  ProgressFormat            "%d %T | %C %p%% | %t  %e |   %g  %s  %W%%"
  AllowSleep                no
  TimeStampInResults        yes

CUDA version info
  binary compiled for CUDA  6.50
  CUDA runtime version      6.50
  CUDA driver version       9.10

CUDA device info
  name                      GeForce GTX 480
  compute capability        2.0
  maximum threads per block 1024
  number of multiprocessors 15 (480 shader cores)
  clock rate                1451MHz

Automatic parameters
  threads per grid          983040

running a simple selftest...
Selftest statistics
  number of tests           92
  successfull tests         92

selftest PASSED!

got assignment: exp=329000033 bit_min=80 bit_max=81 (744.28 GHz-days)
Starting trial factoring M329000033 from 2^80 to 2^81 (744.28 GHz-days)
 k_min = 1837273097800140
 k_max = 3674546195606701
Using GPU kernel "barrett87_mul32_gs"

found a valid checkpoint file!
  last finished class was: 2391
  found 0 factor(s) already

Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
May 28 01:18 | 2392  52.2% | 95.831  12h13m |    698.99    82485    n.a.%
M329000033 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 77: an illegal memory access was encountered
batch wrapper reports mfaktc-win-64.exe exited at Mon 05/28/2018  1:18:34.45

Last fiddled with by kriesel on 2020-10-25 at 19:12
Old 2020-10-25, 18:41   #3406
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

2·3·13·41 Posts
Default

Ah, that at least explains how it could come up with a Mersenne factor, albeit for the wrong exponent. I guess really the code should (but doesn't) reset the factor variable when starting a new exponent and so when the "found a factor" code block gets incorrectly triggered due to a hardware/driver error it uses the last value (from the quick-self-test).
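
The failure mode guessed at here can be modelled with a toy Python example. To be clear, this is not mfaktc's code: `ResultBuffer`, `toy_kernel` and `run_assignment` are hypothetical stand-ins, and the "fault" simply sets a found-count without writing a factor. It only shows how a buffer that is not reset between exponents can leak the selftest value, and how a reset or a cheap host-side check changes the outcome.

```python
from typing import Optional

SELFTEST_P = 3321928619
SELFTEST_F = 38814612911305349835664385407


class ResultBuffer:
    """Stand-in for a host/device result buffer (hypothetical layout)."""

    def __init__(self) -> None:
        self.count = 0   # number of factors the "kernel" claims to have found
        self.factor = 0  # last factor value written


def toy_kernel(buf: ResultBuffer, factor: Optional[int], fault: bool) -> None:
    """Pretend GPU launch: a fault corrupts the count but writes no factor."""
    if fault:
        buf.count = 1
    elif factor is not None:
        buf.count, buf.factor = 1, factor


def plausible(p: int, f: int) -> bool:
    """Cheap host-side check: a factor of M(p) must be 1 mod 2p."""
    return f > 1 and f % (2 * p) == 1


def run_assignment(buf, p, factor=None, fault=False, reset=False, validate=False):
    """Run one exponent and return the reported factor, or None."""
    if reset:
        buf.count, buf.factor = 0, 0
    toy_kernel(buf, factor, fault)
    if buf.count == 0:
        return None
    if validate and not plausible(p, buf.factor):
        return None  # a real program would report a hardware/driver error here
    return buf.factor


if __name__ == "__main__":
    buf = ResultBuffer()
    run_assignment(buf, SELFTEST_P, factor=SELFTEST_F)          # selftest
    print(run_assignment(buf, 140615327, fault=True))           # stale value
    run_assignment(buf, SELFTEST_P, factor=SELFTEST_F)          # selftest again
    print(run_assignment(buf, 140615327, fault=True, validate=True))
```

In this model a reset alone turns the bogus report into an obviously wrong 0, while the validation step rejects it outright. Neither fixes the underlying hardware or driver fault.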
Old 2020-10-25, 20:48   #3407
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

4,751 Posts
Default

The known, usually bogus, factor 388...407 has come up in discussion about once a year since at least 2017. From my recently updated mfaktc thread notes file:

posts 2787 - 2806 2017-12-31 https://www.mersenneforum.org/showpo...postcount=2787
false factor report 38814612911305349835664385407 and ensuing discussion including ways to attempt eliminating its appearance

posts 2824-2827 July 2018 reproducible false factor; bad gpu ram https://www.mersenneforum.org/showpo...postcount=2824

post 3167 2019-07-02 the usual false factor is (also) seen to correlate with Windows TDRs https://www.mersenneforum.org/showpo...postcount=3167
(Check Windows system event log for Windows TDR events)

post 3177 2019-07-21 TheJudger able to reproduce the issue https://www.mersenneforum.org/showpo...postcount=3177

And now, also posts 3401-3406+, Oct 2020.

In a nutshell, the usual suspects are: cooling/temperatures; bad GPU RAM; other hardware problems; a GPU that is too slow; default or inadequate values in the Windows TDR (Timeout Detection and Recovery) related registry entries.
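
For the last item, a read-only Python sketch for inspecting the TDR registry entries on a Windows box follows. The key path and the value names `TdrLevel`, `TdrDelay` and `TdrDdiDelay` come from Microsoft's documentation, not from this thread, and the thread does not say which values were changed or to what, so treat this as an assumption-laden starting point. On non-Windows systems the function just returns an empty dict.

```python
import sys

# Location of the TDR values per Microsoft's display-driver documentation.
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
TDR_VALUES = ("TdrLevel", "TdrDelay", "TdrDdiDelay")


def read_tdr_settings() -> dict:
    """Return explicitly set TDR values; {} if none are set or not on Windows."""
    if not sys.platform.startswith("win"):
        return {}
    import winreg  # standard library, Windows only

    settings = {}
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
            for name in TDR_VALUES:
                try:
                    value, _value_type = winreg.QueryValueEx(key, name)
                    settings[name] = value
                except FileNotFoundError:
                    pass  # value absent: Windows uses its built-in default
    except OSError:
        pass  # key missing or not readable
    return settings


if __name__ == "__main__":
    found = read_tdr_settings()
    if found:
        for name, value in found.items():
            print(name, "=", value)
    else:
        print("No explicit TDR values found (defaults apply, or not Windows).")
```

An absent `TdrDelay` means the documented default of 2 seconds applies, which is short compared with the multi-second class times a slow GPU can need.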

Last fiddled with by kriesel on 2020-10-25 at 20:50
Old 2020-10-25, 23:53   #3408
Ensigm
 
Aug 2020

3×5×7 Posts
Default mfaktc for CUDA 11

Is there somewhere I can find a compiled mfaktc binary for CUDA 11 (or a convenient way to compile it)? I'm using Google Colab and recently it started giving me runtimes with CUDA 11.



Code:
CUDA version info
  binary compiled for CUDA  10.10
  CUDA runtime version      0.75
  CUDA driver version       11.10
Old 2020-10-26, 00:55   #3409
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

4,751 Posts
Default

Would someone please post CUDA11 mfaktc (2047 limit or higher GpuSieveSize; at least more-classes) compiled packages for Ubuntu Linux or Windows 7-10? I haven't seen any of either.

TheJudger explains how to compile for CUDA10, and how to set up for that, in post 2910. That seems a likely good starting point for tackling CUDA11.

There's also https://www.mersenneforum.org/showpo...postcount=3086
to 3088, nomead getting his system to compile mfaktc in Windows
https://www.mersenneforum.org/showpo...postcount=3088 updated build process by nomead

Once posted, James Heinrich may add them to the mersenne.ca download mirror.
Old 2020-10-26, 14:38   #3410
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

4,751 Posts
Default Windows TDR related false factor on s l o w gpu

This slow gpu came with one of the used systems I bought. I check out any model I get access to, and this one was interesting in what it revealed about a possible source of the known false factor occurrence.

The ancient, slow NVIDIA NVS 295 with default Windows TDR settings (note also the bogus ~200 GHz-d/day indication):
Code:
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:45:32.15 
mfaktc v0.21 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  Checkpoints               enabled
  CheckpointDelay           900s
  WorkFileAddDelay          3600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  kriesel
  ComputerID                eaglet-nvs295
  AllowSleep                no
  TimeStampInResults        yes

CUDA version info
  binary compiled for CUDA  6.50
  CUDA runtime version      6.50
  CUDA driver version       6.50

CUDA device info
  name                      Quadro NVS 295
  compute capability        1.1
  max threads per block     512
  max shared memory per MP  16384 byte
  number of multiprocessors 1
  CUDA cores per MP         8
  CUDA cores - total        8
  clock rate (CUDA cores)   1300MHz
  memory clock rate:        695MHz
  memory bus width:         64 bit

Automatic parameters
  threads per grid          1048576
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144

running a simple selftest...
Selftest statistics
  number of tests           107
  successfull tests         107

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:46 |    0   0.1% |  3.548  56m43s |    202.20    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:46:14.24mfaktc quadro nvs295 exit logged by batch wrapper
After modifying the Windows TDR registry settings:
Code:
selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 21:05 |    0   0.1% | 411.97   4d13h |      1.74    82485    n.a.%
Jul 02 21:11 |    5   0.2% | 411.85   4d13h |      1.74    82485    n.a.%
Jul 02 21:18 |    9   0.3% | 411.49   4d13h |      1.74    82485    n.a.%
Jul 02 21:25 |   12   0.4% | 411.39   4d13h |      1.74    82485    n.a.%
Jul 02 21:32 |   20   0.5% | 411.78   4d13h |      1.74    82485    n.a.%
Jul 02 21:39 |   21   0.6% | 412.10   4d13h |      1.74    82485    n.a.%
Jul 02 21:46 |   29   0.7% | 411.38   4d12h |      1.74    82485    n.a.%
Jul 02 21:53 |   32   0.8% | 410.54   4d12h |      1.75    82485    n.a.%
Jul 02 21:59 |   36   0.9% | 410.11   4d12h |      1.75    82485    n.a.%
Jul 02 22:06 |   41   1.0% | 410.09   4d12h |      1.75    82485    n.a.%
Jul 02 22:13 |   44   1.1% | 410.10   4d12h |      1.75    82485    n.a.%
Jul 02 22:20 |   56   1.3% | 411.00   4d12h |      1.75    82485    n.a.%
Jul 02 22:27 |   57   1.4% | 412.03   4d12h |      1.74    82485    n.a.%
Jul 02 22:34 |   60   1.5% | 411.38   4d12h |      1.74    82485    n.a.%
Jul 02 22:41 |   65   1.6% | 410.46   4d11h |      1.75    82485    n.a.%
Jul 02 22:47 |   69   1.7% | 410.11   4d11h |      1.75    82485    n.a.%
Jul 02 22:54 |   77   1.8% | 410.10   4d11h |      1.75    82485    n.a.%
Jul 02 23:01 |   81   1.9% | 410.11   4d11h |      1.75    82485    n.a.%
Jul 02 23:08 |   84   2.0% | 410.12   4d11h |      1.75    82485    n.a.%
Jul 02 23:15 |   89   2.1% | 410.07   4d11h |      1.75    82485    n.a.%
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 23:22 |   92   2.2% | 410.44   4d11h |      1.75    82485    n.a.%
Jul 02 23:28 |   96   2.3% | 411.38   4d11h |      1.74    82485    n.a.%
Jul 02 23:35 |  104   2.4% | 410.48   4d10h |      1.75    82485    n.a.%
Jul 02 23:42 |  117   2.5% | 410.10   4d10h |      1.75    82485    n.a.%
Jul 02 23:49 |  120   2.6% | 410.09   4d10h |      1.75    82485    n.a.%
Jul 02 23:56 |  125   2.7% | 410.10   4d10h |      1.75    82485    n.a.%
...
Jul 07 10:07 | 4601  99.8% | 410.11  13m40s |      1.75    82485    n.a.%
Jul 07 10:14 | 4605  99.9% | 410.09   6m50s |      1.75    82485    n.a.%
Jul 07 10:21 | 4617 100.0% | 411.44   0m00s |      1.74    82485    n.a.%
no factor for M119998999 from 2^72 to 2^73 [mfaktc 0.21 barrett76_mul32_gs]
 tf(): total time spent: 4d 13h 23m  8.488s
After that long experiment, it was removed and placed in the last-resort spare-parts bin.
For more info on TDR, see https://docs.nvidia.com/gameworks/co...n_recovery.htm and https://www.mersenneforum.org/showpo...3&postcount=10
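
As a rough cross-check of the two throughput figures above, here is a small Python sketch. It assumes that GHz-d/day is simply the assignment's credit divided by the projected run time, and that with MORE_CLASSES enabled only those of the 4620 classes of k that can yield a factor 2kp+1 (1 or 7 mod 8, not divisible by 3, 5, 7 or 11) are actually run. That formula is inferred from the numbers in the logs, not taken from the mfaktc source.

```python
NUM_CLASSES = 4620  # 4 * 3 * 5 * 7 * 11, the MORE_CLASSES setting in the logs


def classes_to_test(p: int) -> int:
    """Count classes c = k mod 4620 that can contain a factor 2*k*p + 1."""
    survivors = 0
    for c in range(NUM_CLASSES):
        f = 2 * c * p + 1
        if f % 8 in (1, 7) and all(f % q for q in (3, 5, 7, 11)):
            survivors += 1
    return survivors


def ghz_days_per_day(credit: float, seconds_per_class: float, classes: int) -> float:
    """Throughput implied by one class time (assumed formula, see above)."""
    projected_days = seconds_per_class * classes / 86400.0
    return credit / projected_days


if __name__ == "__main__":
    p, credit = 119998999, 7.97
    n = classes_to_test(p)
    print("classes to test:", n)
    print("default TDR run:", round(ghz_days_per_day(credit, 3.548, n), 2))
    print("after TDR change:", round(ghz_days_per_day(credit, 411.97, n), 2))
```

Under these assumptions the 3.548 s class time from the first log works out to roughly 202 GHz-d/day and the 411.97 s class time to roughly 1.74, matching both logs. So the ~200 figure is what a class time of 3.5 s would imply; it most likely reflects a kernel that was cut short, not work that was actually done.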

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.