#375
"Dylan"
Mar 2017
250₁₆ Posts
Attached is a CUDA 11.2 binary of mmff compiled on an Arch Linux system, using the cleaned-up source posted by Fan Ming here. It should work on any Linux system with the latest Nvidia drivers.
This one does seem to work with MM107 and MM127, using the examples posted by Fan Ming in posts 363 and 364 with the flag -v 3. MM127: Code:
mmff v0.28 (64bit built)
Compiletime options
  THREADS_PER_BLOCK 256
  MORE_CLASSES enabled
Runtime options
  GPU Sieving enabled
  GPUSievePrimes 500000
  GPUSieveSize 32M bits
  GPUSieveProcessSize 16K bits
  WorkFile worktodo.txt
  Checkpoints enabled
  CheckpointDelay 120s
  StopAfterFactor disabled
  PrintMode full
  V5UserID (none)
  ComputerID (none)
  WARNING, no GPUProgressHeader specified in mmff.ini, using default
  GPUProgressHeader " class | raw cand. | time | ETA | raw rate | SievePrimes"
  WARNING, no GPUProgressFormat specified in mmff.ini, using default
  GPUProgressFormat "%C/4620 | %n | %ts | %e | %rM/s | %s"
  TimeStampInResults no
CUDA version info
  binary compiled for CUDA 11.20
  CUDA runtime version 11.20
  CUDA driver version 11.20
CUDA device info
  name GeForce GTX 1660 Ti
  compute capability 7.5
  maximum threads per block 1024
  number of mutliprocessors 24 (unknown number of shader cores)
  clock rate 1590MHz
got assignment: MM127, k range 562949953421312 to 1125899906842623 (178-bit factors)
Starting trial factoring of MM127 in k range: 562949953421312 to 1125899906842623 (178-bit factors)
k_min = 562949953421312
k_max = 1125899906842623
Using GPU kernel "mfaktc_barrett183_M127gs"
Verifying (2^(2^127)) % 191561944857917697129840166812120120096271125295021529 = 158757927754760480688654173499199469295287057656270356
Verifying (2^(2^127)) % 191614694258348779445950559282708892489982390750176689 = 33662559093375555778002927546058307399215184129861713
Verifying (2^(2^127)) % 191667446858012590842648103696906711574948849832403649 = 58322051460264670631592692291826830098619657940589851
Verifying (2^(2^127)) % 191720197063361195168223175182300315372061913389308569 = 161044083194471348086645110435896183576905312890737395
Verifying (2^(2^127)) % 191772947100494614230101526639209315731354679296043129 = 62574822488929725322867766087082605619226720248185889
Verifying (2^(2^127)) % 191825698154779667550033876773029890149243076788387249 = 178143699218529778276359107673564439346000111582072990
Verifying (2^(2^127)) % 191878449278237320417654597759685254765861316305100489 = 167970576150375862734607277121394055064519674969245756
Verifying (2^(2^127)) % 191931200154874561262841813657816481627920801325769369 = 68989798066545733249246754983396020869954214592518662
Verifying (2^(2^127)) % 191983951926039282622453643539197609014463925252484369 = 97958328059656999804568015951825574551513310180672117
Verifying (2^(2^127)) % 192036702536990857023110525934395209885947426134099129 = 57502989442863596951607305398169962920918382075438075
received signal "SIGINT"
mmff will exit once the current class is finished.
press ^C again to exit immediately
mmff will exit NOW!
MM107: Code:
mmff v0.28 (64bit built)
Compiletime options
  THREADS_PER_BLOCK 256
  MORE_CLASSES enabled
Runtime options
  GPU Sieving enabled
  GPUSievePrimes 500000
  GPUSieveSize 32M bits
  GPUSieveProcessSize 16K bits
  WorkFile worktodo.txt
  Checkpoints enabled
  CheckpointDelay 120s
  StopAfterFactor disabled
  PrintMode full
  V5UserID (none)
  ComputerID (none)
  WARNING, no GPUProgressHeader specified in mmff.ini, using default
  GPUProgressHeader " class | raw cand. | time | ETA | raw rate | SievePrimes"
  WARNING, no GPUProgressFormat specified in mmff.ini, using default
  GPUProgressFormat "%C/4620 | %n | %ts | %e | %rM/s | %s"
  TimeStampInResults no
CUDA version info
  binary compiled for CUDA 11.20
  CUDA runtime version 11.20
  CUDA driver version 11.20
CUDA device info
  name GeForce GTX 1660 Ti
  compute capability 7.5
  maximum threads per block 1024
  number of mutliprocessors 24 (unknown number of shader cores)
  clock rate 1590MHz
WARNING: ignoring line 1 in "worktodo.txt"! Reason: doesn't begin with Factor=
got assignment: MM107, k range 41400000000000 to 41500000000000 (154-bit factors)
Starting trial factoring of MM107 in k range: 41400G to 41500G (154-bit factors)
k_min = 41400000000000
k_max = 41500000000000
Using GPU kernel "mfaktc_barrett160_M107gs"
Verifying (2^(2^107)) % 13435068670193779240929580104031093912799413681 = 11943755078920637255837466212346786801214623286
class | raw cand. | time | ETA | raw rate | SievePrimes
0/4620 | 21.66M | 0.031s | n.a. | 698.70M/s | 500277
Verifying (2^(2^107)) % 13435068674693228987403666670879552138089175391 = 10351997845221972775324276802600874943890505684
5/4620 | 21.66M | 0.031s | n.a. | 698.70M/s | 500277
...
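For readers who want to reproduce the "Verifying" lines on a CPU, here is a minimal Python sketch. It assumes the usual form of double Mersenne factor candidates, q = 2·k·M_p + 1 with M_p = 2^p − 1, and that the logged quantity is 2^(2^p) mod q as printed. The function names are made up for illustration and are not part of mmff. Because 2^p = M_p + 1, an odd q divides MM_p = 2^(M_p) − 1 exactly when that residue equals 2.

```python
def mersenne(p: int) -> int:
    """M_p = 2^p - 1."""
    return (1 << p) - 1


def candidate(p: int, k: int) -> int:
    """Factor candidate q = 2*k*M_p + 1 for MM_p (illustrative helper)."""
    return 2 * k * mersenne(p) + 1


def residue(p: int, q: int) -> int:
    """2^(2^p) mod q, the quantity shown after 'Verifying' in the log."""
    return pow(2, 1 << p, q)


def divides_double_mersenne(p: int, q: int) -> bool:
    """q | 2^(M_p) - 1  <=>  2^(2^p) == 2 (mod q), for odd q > 2."""
    return residue(p, q) == 2


if __name__ == "__main__":
    # Bit sizes at the upper end of the two k ranges from the logs.
    print(candidate(127, 1125899906842623).bit_length())  # 178
    print(candidate(107, 41500000000000).bit_length())    # 154

    # Sanity check on a known factor of MM13 (k = 20644229).
    q13 = candidate(13, 20644229)
    print(q13, divides_double_mersenne(13, q13))
```

None of the residues in the logs above equals 2, so under this reading none of the printed candidates divides MM127 or MM107; the lines only show that the kernels produce residues that can be cross-checked.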
#376
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3×11×101 Posts
Quote:
Quote:
Quote:
Code:
mmff v0.28 (64bit built)
Compiletime options
  THREADS_PER_BLOCK 256
  MORE_CLASSES enabled
Runtime options
  GPU Sieving enabled
  WARNING: Cannot read GPUSievePrimes from mmff.ini, using default value (82486)
  GPUSievePrimes depends on worktodo entry
  GPUSieveSize 2047M bits
  GPUSieveProcessSize 16K bits
  WorkFile worktodo.txt
  Checkpoints enabled
  CheckpointDelay 300s
  StopAfterFactor disabled
  PrintMode full
  V5UserID kriesel
  ComputerID emu/gtx1650
  TimeStampInResults yes
CUDA version info
  binary compiled for CUDA 10.10
  CUDA runtime version 10.10
  CUDA driver version 10.20
CUDA device info
  name GeForce GTX 1650
  compute capability 7.5
  maximum threads per block 1024
  number of mutliprocessors 14 (unknown number of shader cores)
  clock rate 1710MHz
Last fiddled with by kriesel on 2021-03-31 at 17:05
#377
"Dylan"
Mar 2017
2⁴×37 Posts
#378
Banned
"Luigi"
Aug 2002
Team Italia
4843₁₀ Posts
Quote:
#379
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3×11×101 Posts
Please update http://www.doublemersennes.org/download.php with the newer binaries posted recently in this thread.
If I read it correctly, this thread has CUDA 10.1 and 11.2 builds, while doublemersennes.org has only up to CUDA 8 and no enlarged GPUSieveSize.
Last fiddled with by kriesel on 2021-05-09 at 18:56
#380
Banned
"Luigi"
Aug 2002
Team Italia
29×167 Posts
Quote:
There are very few participants in this subproject, and no one complained (hard) until now...
#381
I moo ablest echo power!
May 2013
709₁₆ Posts
Any chance someone could make a Windows build using a recent CUDA version for CC 8.6? The CC 7.5 version throws an error:
Code:
ERROR: cudaGetLastError() returned 209: no kernel image is available for execution on the device
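A short sketch of why a CC 7.5 build can fail this way on a CC 8.6 card. It encodes the general CUDA compatibility rules: precompiled GPU code (SASS) for sm_X.Y runs only on devices with the same major version and a minor version of at least Y, while embedded PTX for compute_X.Y can be JIT-compiled on any device of capability X.Y or higher. That the posted binary carries only sm_75 code and no PTX is an assumption consistent with the error, not something confirmed in the thread. The function names are illustrative only.

```python
from typing import Iterable, Tuple


def parse_cc(text: str) -> Tuple[int, int]:
    """Turn a compute capability string such as '8.6' into (8, 6)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def can_run(device_cc: str,
            sass_targets: Iterable[str],
            ptx_targets: Iterable[str] = ()) -> bool:
    """Return True if a binary with the given targets has a usable kernel image."""
    dev = parse_cc(device_cc)
    # Precompiled code: same major version, device minor >= target minor.
    for target in sass_targets:
        t_major, t_minor = parse_cc(target)
        if t_major == dev[0] and t_minor <= dev[1]:
            return True
    # Embedded PTX: the driver can JIT it for any equal or newer device.
    for target in ptx_targets:
        if parse_cc(target) <= dev:
            return True
    return False


if __name__ == "__main__":
    print(can_run("7.5", ["7.5"]))           # native match
    print(can_run("8.6", ["7.5"]))           # the situation reported above
    print(can_run("8.6", ["7.5"], ["7.5"]))  # same build with PTX included
```

Under these rules a rebuild that targets sm_86, or one that embeds PTX, would be needed for the newer card; sm_86 support requires a sufficiently recent CUDA toolkit.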
#382
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1A0A₁₆ Posts
Quote:
Quote:
Quote:
2^185 / 2 / (2^127 - 1) is ~2^57, which is 144,115,188,075,855,872. Going to 288.23E15 (186 bits) would be roughly 6 x 144.12 = 865 days = 2.37 years on a single GTX1650. To go to 187, 4.74 years more; to 188, 9.5 years more; total 14.24 GTX1650-years, or about 5 RTX2080-years. Four GTX1650 plus 2 RTX2080 could get there in ~1.5 years, if no factor is found before then. (I estimate factor odds from 185 through 188 as 1.6%.)
Other supported double mersennes have a little more bits margin remaining in mmff v0.28. Code:
MMp | URL | k max done | ~bits | kernels’ max bits | bits left
MM31 | http://www.doublemersennes.org/mm31.php | 450.E15 | 90.64 | 89, 96 | 5.36
MM61 | http://www.doublemersennes.org/mm61.php | 230.E15 | 119.67 | 108, 120, 128 | 8.33
MM89 | http://www.doublemersennes.org/mm89.php | 54.E15 | 145.58 | 128, 140, 152, 160 | 14.42
MM107 | http://www.doublemersennes.org/mm107.php | 10.E15 | 161.15 | 152, 160, 172 | 10.85
MM127 | http://www.doublemersennes.org/mm127.php | 145.E15 | 185.01 | 183, 185, 188 | 2.99
Last fiddled with by kriesel on 2021-09-12 at 11:20
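The arithmetic in this post can be reproduced with a few lines of Python. This is a sketch under stated assumptions: factor candidates have the form q = 2·k·M_p + 1, the throughput of about 6 days per 1E15 of k on a GTX1650 is inferred from the "6 x 144.12" in the post, and the odds use the common rule of thumb of roughly 1/b for a factor at bit level b (the post does not say how its 1.6% was derived). All function names are hypothetical.

```python
import math

DAYS_PER_1E15_K = 6.0  # assumption inferred from "6 x 144.12 = 865 days"


def mersenne(p: int) -> int:
    return (1 << p) - 1


def k_limit(p: int, bits: int) -> int:
    """Largest k for which q = 2*k*M_p + 1 is still below 2^bits."""
    return ((1 << bits) - 2) // (2 * mersenne(p))


def bits_done(p: int, k: int) -> float:
    """Bit level reached once all k up to the given value are tested."""
    return math.log2(2 * k * mersenne(p) + 1)


def years_for_bit(p: int, bits: int) -> float:
    """GTX1650-years to extend MM_p from bits-1 to bits (365-day years)."""
    span = k_limit(p, bits) - k_limit(p, bits - 1)
    return span / 1e15 * DAYS_PER_1E15_K / 365


def factor_odds(lo_bits: int, hi_bits: int) -> float:
    """Heuristic chance of a factor between 2^lo_bits and 2^hi_bits."""
    return sum(1 / b for b in range(lo_bits + 1, hi_bits + 1))


if __name__ == "__main__":
    print(k_limit(127, 185))  # 144115188075855872
    for b in (186, 187, 188):
        print(b, f"{years_for_bit(127, b):.2f} GTX1650-years")
    print(f"{bits_done(127, 145 * 10**15):.2f} bits done for MM127")
    print(f"{factor_odds(185, 188):.1%} heuristic factor odds")
```

With these assumptions the three steps come out near 2.37, 4.74 and 9.48 years. Note that the 14.24 quoted in the post equals the 187-bit and 188-bit steps combined (4.74 + 9.5); adding the 186-bit step as well gives about 16.6 GTX1650-years.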
#383
Bemusing Prompter
"Danny"
Dec 2002
California
2·1,237 Posts
I agree that it's time to add some new kernels for MM127. Someone could easily reach the limit in a few months using multiple GPUs if they're dedicated enough. We should extend the limit to at least 200 bits because there are quite a few people who are eager to solve the Catalan–Mersenne conjecture.
On a side note, someone has modified mmff to trial factor Wagstaff numbers with Mersenne prime exponents: https://mersenneforum.org/showpost.p...05&postcount=9
Last fiddled with by ixfd64 on 2022-05-25 at 21:18
#385
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
9903₁₀ Posts
Quote:
Check out the v0.28 branch. It may be that only a few changes are needed to the middle-level code. The low-level code has been written and tested; 90% of the work is done. A word of caution for those who would like to extend by another 32 bits: I looked at that opportunity back then, and there were not enough registers to continue in the same paradigm. Maybe someone could implement Karatsuba for kernels of the next size; iirc the existing code is not Karatsuba but simply "school long multiplication in quasi digits of the operationally optimal size". Karatsuba may be faster but may also need even more spare registers. Another thought is that now, 7 years later, some high-end cards will fit the required specs. Yet another thought is that someone might want to port it all from CUDA to something universal.
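To make the trade-off concrete, here is a rough Python sketch comparing school long multiplication with a single level of Karatsuba on multi-word integers. It is not mmff's kernel code: the 32-bit limb size, the function names, and the use of Python integers for the additions are all assumptions made for illustration. It only counts limb multiplications and says nothing about GPU register pressure.

```python
LIMB_BITS = 32  # assumed digit size, for illustration only
MASK = (1 << LIMB_BITS) - 1


def to_limbs(x, n):
    """Split x into n little-endian limbs."""
    return [(x >> (LIMB_BITS * i)) & MASK for i in range(n)]


def from_limbs(limbs):
    return sum(v << (LIMB_BITS * i) for i, v in enumerate(limbs))


def schoolbook(a, b):
    """School long multiplication; returns (product limbs, limb multiplies)."""
    out = [0] * (len(a) + len(b))
    mults = 0
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            t = out[i + j] + ai * bj + carry
            mults += 1
            out[i + j] = t & MASK
            carry = t >> LIMB_BITS
        out[i + len(b)] = carry
    return out, mults


def karatsuba_once(a, b):
    """One Karatsuba split; the three sub-products use schoolbook."""
    n = len(a)
    assert n == len(b) and n % 2 == 0
    h = n // 2
    z0, m0 = schoolbook(a[:h], b[:h])  # low halves
    z2, m2 = schoolbook(a[h:], b[h:])  # high halves
    # The half sums can overflow by one bit, so they need h + 1 limbs.
    sa = to_limbs(from_limbs(a[:h]) + from_limbs(a[h:]), h + 1)
    sb = to_limbs(from_limbs(b[:h]) + from_limbs(b[h:]), h + 1)
    zm, m1 = schoolbook(sa, sb)
    z0i, z2i = from_limbs(z0), from_limbs(z2)
    z1i = from_limbs(zm) - z0i - z2i  # cross term a0*b1 + a1*b0
    shift = LIMB_BITS * h
    total = z0i + (z1i << shift) + (z2i << (2 * shift))
    return to_limbs(total, 2 * n), m0 + m1 + m2


if __name__ == "__main__":
    x = to_limbs(2**191 + 12345678901234567890, 6)
    y = to_limbs(2**192 - 1, 6)
    print(schoolbook(x, y)[1], karatsuba_once(x, y)[1])  # 36 34
```

In this naive form, 192-bit operands need 36 limb multiplies with the school method and 34 with one Karatsuba level, plus extra temporaries for the half sums and the middle product. That small gain against added storage is in line with the caution above about spare registers.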