#3499 · "Seth" · Apr 2019 · 2⁴·3³ Posts

Math is hard

Code:
$ cat worktodo.txt
Factor=N/A,960477823,66,67
Factor=N/A,960477823,67,68
$ ./mfaktc.exe
mfaktc v0.21 (64bit built)
...
got assignment: exp=960477823 bit_min=66 bit_max=67 (0.02 GHz-days)
Starting trial factoring M960477823 from 2^66 to 2^67 (0.02 GHz-days)
 k_min = 38411594760
 k_max = 76823196254
Using GPU kernel "barrett76_mul32_gs"
M960477823 has a factor: 147602823780943516039
found 1 factor for M960477823 from 2^66 to 2^67 [mfaktc 0.21 barrett76_mul32_gs]
WARNING: ignoring line 1 in "worktodo.txt"! Reason: doesn't begin with Factor=
WARNING: ignoring line 2 in "worktodo.txt"! Reason: doesn't begin with Factor=
got assignment: exp=960477823 bit_min=67 bit_max=68 (0.03 GHz-days)
Starting trial factoring M960477823 from 2^67 to 2^68 (0.03 GHz-days)
 k_min = 76823194140
 k_max = 153646392509
M960477823 has a factor: 147602823780943516039
found 1 factor for M960477823 from 2^67 to 2^68 [mfaktc 0.21 barrett76_mul32_gs]
$ python -c 'import math; print(math.log2(147602823780943516039))'
67.00028221952357
#3500 · "Viliam Furík" · Jul 2018 · Martin, Slovakia · 3·251 Posts
Yes, there is an overlap between the k_max (76823196254) of the 66-67 bit range and the k_min (76823194140) of the 67-68 bit range, so the k of this composite factor, 76838225853, can be found in both ranges.
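Sanity-checking this in Python: every factor f of M_p = 2^p − 1 has the form f = 2kp + 1, so k can be recovered from the reported factor (exponent and factor taken from the log in the post above):

```python
# Every factor f of M_p = 2^p - 1 has the form f = 2*k*p + 1,
# so k is recoverable from a reported factor as (f - 1) / (2*p).
p = 960477823                  # exponent from the log above
f = 147602823780943516039      # factor reported at both bitlevels

assert (f - 1) % (2 * p) == 0  # f really has the 2*k*p + 1 form
k = (f - 1) // (2 * p)
print(k)  # 76838225853
```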
#3501 · Mar 2014 · 2²·13 Posts
I have just received a shiny new laptop, and am having some trouble getting it set up.

I have run mfaktc and mfakto, once each, on previous machines, and remember it mostly being a matter of having drivers up to date and picking the right one of mfaktc or mfakto. This time has been harder, and I would appreciate some advice.

Windows 10, i7-10750H CPU @ 2.60 GHz, 32 GB RAM, NVIDIA GeForce GTX 1660 Ti video card. I downloaded and unzipped mfaktc-0.21.win_cuda11.2-2047.zip. I downloaded and installed NVIDIA's latest set of tools (cuda_11.4.1_471.41_win10.exe), and when that didn't work, uninstalled it and tried again with cuda_11.2.0_460.89_win10.exe. I grabbed cudart64_110.dll off the web (I think off this forum!) and tried placing it in various places: in the system32 directory, in the mfaktc directory, in the same place as the other NVIDIA DLLs. Each time the self-test exits with error 209: no kernel image is available for execution on the device.

Any suggestions what to try next are welcome. Is there a cudart64_112.dll I need? (I didn't run across one on the web.) A different directory I need to place the cudart file in? Something else obvious I did wrong?

Complete self-test result pasted below:

Code:
D:\grb\math\mfaktc>mfaktc-win-64 -st
mfaktc v0.21 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              2047Mi bits
  GPUSieveProcessSize       16Ki bits
  Checkpoints               enabled
  CheckpointDelay           30s
  WorkFileAddDelay          600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  (none)
  ComputerID                (none)
  AllowSleep                no
  TimeStampInResults        no

CUDA version info
  binary compiled for CUDA  11.20
  CUDA runtime version      11.20
  CUDA driver version       11.20

CUDA device info
  name                      GeForce GTX 1660 Ti
  compute capability        7.5
  max threads per block     1024
  max shared memory per MP  65536 byte
  number of multiprocessors 24
  clock rate (CUDA cores)   1590MHz
  memory clock rate:        6001MHz
  memory bus width:         192 bit

Automatic parameters
  threads per grid          786432
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144

########## testcase 1/2867 ##########
Starting trial factoring M50804297 from 2^67 to 2^68 (0.59 GHz-days)
Using GPU kernel "75bit_mul32_gs"
   Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Aug 12 09:26 | 3387   0.1% |  0.001    n.a. |      n.a.    82485    n.a.%
ERROR: cudaGetLastError() returned 209: no kernel image is available for execution on the device

D:\grb\math\mfaktc>
#3502 · "James Heinrich" · May 2004 · ex-Northern Ontario · 3×1,237 Posts
#3503 · "TF79LL86GIMPS96gpu17" · Mar 2017 · US midwest · 6476₁₀ Posts
Use the reference info, such as https://www.mersenneforum.org/showpo...18&postcount=1 https://www.mersenneforum.org/showpo...1&postcount=11 to better understand compatibility requirements. Good luck. Code:
mfaktc v0.21 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                4
  CPUStreams                3
  GridSize                  3
  GPUSievePrimes            92000
  GPUSieveSize              2047Mi bits
  GPUSieveProcessSize       32Ki bits
  Checkpoints               enabled
  CheckpointDelay           600s
  WorkFileAddDelay          3600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  kriesel
  ComputerID                asrock-gtx1650Super
  ProgressHeader            "Date Time | class Pct | time ETA | GHz-d/day Sieve Wait"
  ProgressFormat            "%d %T | %C %p%% | %t %e | %g %s %W%%"
  AllowSleep                yes
  TimeStampInResults        yes

CUDA version info
  binary compiled for CUDA  10.0
  CUDA runtime version      10.0
  CUDA driver version       11.0

CUDA device info
  name                      GeForce GTX 1650 SUPER
  compute capability        7.5
  max threads per block     1024
  max shared memory per MP  65536 byte
  number of multiprocessors 20
  clock rate (CUDA cores)   1740MHz
  memory clock rate:        6001MHz
  memory bus width:         128 bit

Automatic parameters
  threads per grid          655360
  random selftest offset    11535
  GPUSievePrimes (adjusted) 92726
  GPUsieve minimum exponent 1197042

running a simple selftest...

Selftest statistics
  number of tests           107
  successfull tests         107

selftest PASSED!
#3504 · Mar 2014 · 64₈ Posts
Thanks, james and kriesel.
I am up and running with the CUDA 10 version. GRB
#3505 · "Seth" · Apr 2019 · 2⁴·3³ Posts
I want to check whether <factor> mod (2 * k * exponent + 1) == 0, where <factor> doesn't fit in an int64. I break the factor up into its base-10 representation (which is what I have in the char*): digit_1 * 10^0 + digit_2 * 10^1 + digit_3 * 10^2 + digit_4 * 10^3 ... I sum each digit_n * (10^(n-1) mod (2*k*exponent+1)) to get a congruent value. I only need to check a handful of divisions to remove 99% of composites. I tested with -st and -st2, and also verified that a bunch of previously found "factors" are no longer found. Let me know how I can help get this committed.
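The digit-by-digit reduction described above can be sketched in Python (the patch itself is C; this sketch uses the equivalent Horner form r = r·10 + digit mod m, which avoids precomputing 10^n mod m but produces the same residue):

```python
def mod_from_decimal(factor_str, m):
    """Compute int(factor_str) % m without forming the full big integer.

    Equivalent to summing digit_n * (10^(n-1) mod m): fold in one
    decimal digit at a time, left to right (Horner's rule).
    """
    r = 0
    for ch in factor_str:
        r = (r * 10 + int(ch)) % m
    return r

# Candidate small divisor of the form 2*k*exponent + 1, as in the patch:
exponent, k = 960477823, 1
m = 2 * k * exponent + 1
big = "147602823780943516039"
assert mod_from_decimal(big, m) == int(big) % m
```

In Python the comparison against `int(big) % m` is just a cross-check; the point of the technique is that C code holding the factor only as a decimal string never needs 128-bit arithmetic to test divisibility.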
#3506 · "Seth" · Apr 2019 · 1B0₁₆ Posts
I tested this over a wider range of assignments and realized a mistake. I mistakenly assumed k had to be odd.
And my patch needs this tiny change:

Code:
--- a/src/output.c
+++ b/src/output.c
@@ -403,8 +403,7 @@ int is_small_composite(uint64_t exponent, char *factor)
  * composites > (4 * 10^8 * exponent^2) can pass, but require much high bitlevels. */
   int len = strlen(factor);
-
-  for (uint64_t k = 1; k <= 10000; k += 2)
+  for (uint64_t k = 1; k <= 10000; k++)
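A quick check of why the odd-k assumption fails: if f1 = 2·k1·p + 1 and f2 = 2·k2·p + 1 both divide M_p, their product is also ≡ 1 (mod 2p), with k = 2·k1·k2·p + k1 + k2, and that k is even whenever k1 and k2 are both odd (a small Python verification, not part of the patch):

```python
# (2*k1*p + 1) * (2*k2*p + 1) = 2*p*(2*k1*k2*p + k1 + k2) + 1,
# so a composite factor has k = 2*k1*k2*p + k1 + k2. For odd k1, k2
# the first term is even and k1 + k2 is even, hence k is even.
p = 960477823
for k1 in (1, 3, 5):
    for k2 in (7, 9, 11):
        f = (2 * k1 * p + 1) * (2 * k2 * p + 1)
        k = (f - 1) // (2 * p)
        assert k == 2 * k1 * k2 * p + k1 + k2
        assert k % 2 == 0   # composite factors can land on even k
```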
#3507 · Aug 2002 · 2×3×1,409 Posts
#3508 · Bemusing Prompter "Danny" · Dec 2002 · California · 2·3²·137 Posts
I found a paper on GPU modular exponentiation that I don't think has been mentioned here before: https://eprint.iacr.org/2007/187.pdf
However, the paper is 14 years old. Does it contain anything that may be useful for us, or is it only stuff we already know? |
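For context, the core operation in trial factoring is one modular exponentiation per candidate: f divides M_p exactly when 2^p ≡ 1 (mod f). A minimal square-and-multiply sketch in Python (mfaktc implements the same idea in fixed-width multiword GPU arithmetic; the examples use the well-known small factors 23 of M11 and 11447 of M97):

```python
def mersenne_has_factor(p, f):
    """Return True iff f divides M_p = 2^p - 1, i.e. 2^p ≡ 1 (mod f)."""
    result, base, e = 1, 2 % f, p
    while e:                       # right-to-left binary exponentiation
        if e & 1:
            result = (result * base) % f
        base = (base * base) % f   # square at every step
        e >>= 1
    return result == 1

assert mersenne_has_factor(11, 23)      # M11 = 2047 = 23 * 89
assert mersenne_has_factor(97, 11447)   # known smallest factor of M97
assert not mersenne_has_factor(97, 13)
```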
#3509 · Sep 2006 · The Netherlands · 2·17·23 Posts
In short, it can run in a completely trivial, embarrassingly parallel fashion on the GPU. Super simple.
Thread | Thread Starter | Forum | Replies | Last Post |
mfakto: an OpenCL program for Mersenne prefactoring | Bdot | GPU Computing | 1684 | 2022-04-19 20:25 |
gr-mfaktc: a CUDA program for generalized repunits prefactoring | MrRepunit | GPU Computing | 40 | 2021-12-27 12:45 |
The P-1 factoring CUDA program | firejuggler | GPU Computing | 753 | 2020-12-12 18:07 |
mfaktc 0.21 - CUDA runtime wrong | keisentraut | Software | 2 | 2020-08-18 07:03 |
World's second-dumbest CUDA program | fivemack | Programming | 112 | 2015-02-12 22:51 |