mersenneforum.org  

mersenneforum.org > Factoring Projects > Operazione Doppi Mersennes

2021-03-31, 15:29   #375
Dylan14
"Dylan"
Mar 2017
2×293 Posts

Attached is a CUDA 11.2 binary of mmff, compiled on an Arch Linux system from the cleaned-up source posted by Fan Ming here. It should work on any Linux system with the latest Nvidia drivers.

This one does seem to work with MM107 and MM127, using the examples posted by Fan Ming in posts 363 and 364 with the flag -v 3.

MM127:

Code:
mmff v0.28 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  MORE_CLASSES              enabled

Runtime options
  GPU Sieving               enabled
  GPUSievePrimes            500000
  GPUSieveSize              32M bits
  GPUSieveProcessSize       16K bits
  WorkFile                  worktodo.txt
  Checkpoints               enabled
  CheckpointDelay           120s
  StopAfterFactor           disabled
  PrintMode                 full
  V5UserID                  (none)
  ComputerID                (none)
WARNING, no GPUProgressHeader specified in mmff.ini, using default
  GPUProgressHeader         "    class |  raw cand. |    time |    ETA |  raw rate | SievePrimes"
WARNING, no GPUProgressFormat specified in mmff.ini, using default
  GPUProgressFormat            "%C/4620 |    %n | %ts | %e | %rM/s |     %s"
  TimeStampInResults        no

CUDA version info
  binary compiled for CUDA  11.20
  CUDA runtime version      11.20
  CUDA driver version       11.20

CUDA device info
  name                      GeForce GTX 1660 Ti
  compute capability        7.5
  maximum threads per block 1024
  number of mutliprocessors 24 (unknown number of shader cores)
  clock rate                1590MHz

got assignment: MM127, k range 562949953421312 to 1125899906842623 (178-bit factors)
Starting trial factoring of MM127 in k range: 562949953421312 to 1125899906842623 (178-bit factors)
 k_min = 562949953421312
 k_max = 1125899906842623
Using GPU kernel "mfaktc_barrett183_M127gs"
Verifying (2^(2^127)) % 191561944857917697129840166812120120096271125295021529 = 158757927754760480688654173499199469295287057656270356
Verifying (2^(2^127)) % 191614694258348779445950559282708892489982390750176689 = 33662559093375555778002927546058307399215184129861713
Verifying (2^(2^127)) % 191667446858012590842648103696906711574948849832403649 = 58322051460264670631592692291826830098619657940589851
Verifying (2^(2^127)) % 191720197063361195168223175182300315372061913389308569 = 161044083194471348086645110435896183576905312890737395
Verifying (2^(2^127)) % 191772947100494614230101526639209315731354679296043129 = 62574822488929725322867766087082605619226720248185889
Verifying (2^(2^127)) % 191825698154779667550033876773029890149243076788387249 = 178143699218529778276359107673564439346000111582072990
Verifying (2^(2^127)) % 191878449278237320417654597759685254765861316305100489 = 167970576150375862734607277121394055064519674969245756
Verifying (2^(2^127)) % 191931200154874561262841813657816481627920801325769369 = 68989798066545733249246754983396020869954214592518662
Verifying (2^(2^127)) % 191983951926039282622453643539197609014463925252484369 = 97958328059656999804568015951825574551513310180672117
Verifying (2^(2^127)) % 192036702536990857023110525934395209885947426134099129 = 57502989442863596951607305398169962920918382075438075
received signal "SIGINT"
mmff will exit once the current class is finished.
press ^C again to exit immediately
mmff will exit NOW!
MM107:

Code:
mmff v0.28 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  MORE_CLASSES              enabled

Runtime options
  GPU Sieving               enabled
  GPUSievePrimes            500000
  GPUSieveSize              32M bits
  GPUSieveProcessSize       16K bits
  WorkFile                  worktodo.txt
  Checkpoints               enabled
  CheckpointDelay           120s
  StopAfterFactor           disabled
  PrintMode                 full
  V5UserID                  (none)
  ComputerID                (none)
WARNING, no GPUProgressHeader specified in mmff.ini, using default
  GPUProgressHeader         "    class |  raw cand. |    time |    ETA |  raw rate | SievePrimes"
WARNING, no GPUProgressFormat specified in mmff.ini, using default
  GPUProgressFormat            "%C/4620 |    %n | %ts | %e | %rM/s |     %s"
  TimeStampInResults        no

CUDA version info
  binary compiled for CUDA  11.20
  CUDA runtime version      11.20
  CUDA driver version       11.20

CUDA device info
  name                      GeForce GTX 1660 Ti
  compute capability        7.5
  maximum threads per block 1024
  number of mutliprocessors 24 (unknown number of shader cores)
  clock rate                1590MHz

WARNING: ignoring line 1 in "worktodo.txt"! Reason: doesn't begin with Factor=
got assignment: MM107, k range 41400000000000 to 41500000000000 (154-bit factors)
Starting trial factoring of MM107 in k range: 41400G to 41500G (154-bit factors)
 k_min = 41400000000000
 k_max = 41500000000000
Using GPU kernel "mfaktc_barrett160_M107gs"
Verifying (2^(2^107)) % 13435068670193779240929580104031093912799413681 = 11943755078920637255837466212346786801214623286
    class |  raw cand. |    time |    ETA |  raw rate | SievePrimes
   0/4620 |     21.66M |  0.031s |   n.a. | 698.70M/s |      500277
Verifying (2^(2^107)) % 13435068674693228987403666670879552138089175391 = 10351997845221972775324276802600874943890505684
   5/4620 |     21.66M |  0.031s |   n.a. | 698.70M/s |      500277
...
I have also attached the full logs from both runs. I'm not sure why the executables built with CUDA 10.1 fail.
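For readers following the logs: each "Verifying" line prints 2^(2^p) modulo a candidate factor f. A candidate f = 2*k*M_p + 1 divides MM_p = 2^(M_p) - 1 exactly when 2^(M_p) == 1 (mod f), which is the same as 2^(2^p) == 2 (mod f), so a residue other than 2 means that candidate is not a factor. Below is a minimal Python sketch of that check, assuming the log lines mean what they literally say. The function names are illustrative and are not taken from the mmff source.

```python
def mersenne(p: int) -> int:
    """Return M_p = 2^p - 1."""
    return (1 << p) - 1

def candidate(p: int, k: int) -> int:
    """Candidate factor of MM_p in the form f = 2*k*M_p + 1."""
    return 2 * k * mersenne(p) + 1

def residue(p: int, f: int) -> int:
    """2^(2^p) mod f, the quantity shown on a "Verifying" log line."""
    return pow(2, 1 << p, f)

def divides_mmp(p: int, f: int) -> bool:
    """True if f divides MM_p = 2^(M_p) - 1, i.e. 2^(2^p) == 2 (mod f)."""
    return residue(p, f) == 2 % f

# Small sanity check: MM5 = 2^31 - 1 is itself of the form 2*k*M5 + 1.
mm5 = (1 << 31) - 1
print(candidate(5, 34636833) == mm5, divides_mmp(5, mm5))  # True True
print(divides_mmp(5, candidate(5, 1)))                     # False (f = 63)

# First MM127 candidate from the log above; compare the output with that log line.
f = 191561944857917697129840166812120120096271125295021529
print(residue(127, f))
```

Python's three-argument pow does the modular exponentiation by repeated squaring, so the 2^127 exponent is not a problem on a CPU for a single candidate; the GPU kernels exist because billions of candidates have to be tested.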
Attached Files
File Type: zip mmff_cuda_11-2.zip (3.73 MB, 147 views)
File Type: txt test.txt (182.9 KB, 130 views)
File Type: txt test2.txt (2.9 KB, 154 views)

2021-03-31, 16:10   #376
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5805₁₀ Posts

Quote:
Originally Posted by Dylan14 View Post
...got assignment: MM127, k range 562949953421312 to 1125899906842623 (178-bit factors)
Starting trial factoring of MM127 in k range: 562949953421312 to 1125899906842623 (178-bit factors)
k_min = 562949953421312
k_max = 1125899906842623
Using GPU kernel "mfaktc_barrett183_M127gs"

...
Cool. Did the posted Arch build include the expanded 2047M GpuSieveSize? You might want to aim higher in your test run, using k ranges and a kernel that are likely to be run in the future, since MM127 TF to 185 bits was completed months ago.
Quote:
Originally Posted by kriesel View Post
[Fri Sep 04 18:24:46 2020]
UID: kriesel/emu/gtx1650, no factor for MM127 in k range: 140000000000000000 to 144115188075855871 (185-bit factors) [mmff 0.28 mfaktc_barrett185_M127gs]

145P ETA <7 days
Quote:
Originally Posted by kriesel View Post
[Thu Sep 10 22:15:10 2020]
UID: kriesel/emu/gtx1650, no factor for MM127 in k range: 144115188075855872 to 145000000000000000 (186-bit factors) [mmff 0.28 mfaktc_barrett188_M127gs]
Info header was
Code:
mmff v0.28 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  MORE_CLASSES              enabled

Runtime options
  GPU Sieving               enabled
WARNING: Cannot read GPUSievePrimes from mmff.ini, using default value (82486)
  GPUSievePrimes            depends on worktodo entry
  GPUSieveSize              2047M bits
  GPUSieveProcessSize       16K bits
  WorkFile                  worktodo.txt
  Checkpoints               enabled
  CheckpointDelay           300s
  StopAfterFactor           disabled
  PrintMode                 full
  V5UserID                  kriesel
  ComputerID                emu/gtx1650
  TimeStampInResults        yes

CUDA version info
  binary compiled for CUDA  10.10
  CUDA runtime version      10.10
  CUDA driver version       10.20

CUDA device info
  name                      GeForce GTX 1650
  compute capability        7.5
  maximum threads per block 1024
  number of mutliprocessors 14 (unknown number of shader cores)
  clock rate                1710MHz
Edut: maibe ficks tha mipselling tu.

Last fiddled with by kriesel on 2021-03-31 at 17:05

2021-03-31, 16:40   #377
Dylan14
"Dylan"
Mar 2017
2·293 Posts

Quote:
Originally Posted by kriesel View Post
Cool. Did the posted Arch build include the expanded 2047M GpuSieveSize?
No, it is limited to 128M bits. I could easily fix that and post an updated build.

2021-03-31, 20:51   #378
ET_
Banned
"Luigi"
Aug 2002
Team Italia
2×3×5×7×23 Posts

Quote:
Originally Posted by Dylan14 View Post
No, it is limited to 128M bits. I could easily fix that and post an updated build.
Would you mind sharing the updated source code with the FermatSearch community (or at least with me)? I have my code running happily on Ubuntu with the 11.1 drivers, but no PrimeGaps speedup ...

2021-05-09, 18:51   #379
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3^3·5·43 Posts

Please update http://www.doublemersennes.org/download.php for the newer binaries posted recently in this thread.
If I read it correctly, this thread has CUDA10.1 and 11.2, while doublemersennes has only up to CUDA8 and no enlarged GPUSieveSize.

Last fiddled with by kriesel on 2021-05-09 at 18:56

2021-05-10, 16:58   #380
ET_
Banned
"Luigi"
Aug 2002
Team Italia
4830₁₀ Posts

Quote:
Originally Posted by kriesel View Post
Please update http://www.doublemersennes.org/download.php for the newer binaries posted recently in this thread.
If I read it correctly, this thread has CUDA10.1 and 11.2, while doublemersennes has only up to CUDA8 and no enlarged GPUSieveSize.
I will.
There are very few participants in this subproject, and no one has complained (much) until now...

2021-08-10, 23:33   #381
wombatman
I moo ablest echo power!
May 2013
11011111000₂ Posts

Any chance someone could make a Windows build using a recent CUDA for CC8.6? The CC7.5 version throws an error:

Code:
ERROR: cudaGetLastError() returned 209: no kernel image is available for execution on the device

2021-09-12, 11:18   #382
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3^3×5×43 Posts
MM127 is on its last mmff kernel & 3 bits

Quote:
Originally Posted by Prime95 View Post
Today I'm releasing a beta version of a GPU factoring program for the double-Mersennes (MM61, MM89, MM107, MM127) and small Fermat numbers (F26 to F157). Sources attached. An Nvidia card with compute capability 2.0 is required.
Quote:
Originally Posted by Uncwilly View Post
I would love for this to be the end of the "MM127 is prime" speculation!
Quote:
Originally Posted by Gary View Post
With these changes, all 43 known factors within the range of mmff can be verified
...
Here is source with these changes and a CUDA 10.1 Linux binary that will hopefully run on Kepler or later (--gpu-architecture=compute_30). I included Serge's patch to print factors found in K*2^N+1 form. If you want factors in the old format, use output.c from the 0.28 release. I also fixed a few other misc things, and changed the version to 0.28.1 to identify this binary. I am not sure who the current owner of mmff is, but if I changed anything in a "bad" way please feel free to fix it and re-post.
If I read the v0.28 source correctly, TF on MM127 is supported up to factors of 2^188. MM127 has been trial factored to slightly over 2^185. A rough estimate of time to complete TF up to the limit of v0.28 is 5 RTX2080-years. That limit could be reached within 2 years by application of multiple GPUs on different blocks in parallel.

2^185 / 2 / (2^127-1) is ~2^57, which is 144,115,188,075,855,872: going to 288.23E15 (186 bits) would be roughly 6 x 144.12 = 865 days = 2.37 years on a single GTX1650.
To go to 187, 4.74 years more; to 188, 9.5 years more; total 14.24 GTX1650-years, or about 5 RTX2080-years.
Four GTX1650 plus 2 RTX2080 could get there in ~1.5 years, if no factor is found before then. (I estimate factor odds from 185 through 188 as 1.6%.)
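The arithmetic above can be reproduced with a short sketch. It assumes, as the estimate does, that candidate factors have the form 2*k*M127 + 1, so the k boundary for a bit level b is about 2^b / 2 / M127 and each extra bit level doubles the k range to be searched. The multiplier 6 is taken from the post and read here as days per 1E15 of k on a GTX1650, which is an interpretation; the names are illustrative.

```python
M127 = (1 << 127) - 1
DAYS_PER_1E15_K = 6.0  # multiplier used in the post; assumed to mean days per 1E15 of k

def k_boundary(bits: int, m: int = M127) -> int:
    """Approximate k at which 2*k*m + 1 reaches 2^bits."""
    return (1 << bits) // (2 * m)

def years_for_level(bits: int) -> float:
    """Estimated GTX1650-years to cover the k range of one bit level."""
    width = k_boundary(bits) - k_boundary(bits - 1)
    return width / 1e15 * DAYS_PER_1E15_K / 365.25

for b in (186, 187, 188):
    lo, hi = k_boundary(b - 1), k_boundary(b)
    print(f"{b} bits: k {lo / 1e15:.2f}E15 to {hi / 1e15:.2f}E15, "
          f"~{years_for_level(b):.2f} GTX1650-years")
```

Here k_boundary(185) evaluates to 2^57 = 144115188075855872 and k_boundary(186) to 2^58 = 288230376151711744, matching the figures quoted above.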

The other supported double Mersennes have a few more bits of margin remaining in mmff v0.28.
Code:
MMp        URL                                  k max done   ~bits   kernels’ max bits    bits left
MM31   http://www.doublemersennes.org/mm31.php    450.E15    90.64    89, 96                5.36    
MM61   http://www.doublemersennes.org/mm61.php    230.E15   119.67    108, 120, 128         8.33    
MM89   http://www.doublemersennes.org/mm89.php     54.E15   145.58    128, 140, 152, 160   14.42    
MM107  http://www.doublemersennes.org/mm107.php    10.E15   161.15    152, 160, 172        10.85    
MM127  http://www.doublemersennes.org/mm127.php   145.E15   185.01    183, 185, 188         2.99
Of these, MM31 has known factors.
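The "~bits" and "bits left" columns can be recomputed from "k max done" with a few lines of Python. This is a sketch that assumes the bit level is log2 of the largest candidate 2*k*M_p + 1 and that "bits left" is measured against the largest kernel limit in each row; the row data is copied from the table and the names are illustrative.

```python
import math

# (p, k max done, largest kernel bit limit), copied from the table above
ROWS = [
    (31, 450 * 10**15, 96),
    (61, 230 * 10**15, 128),
    (89, 54 * 10**15, 160),
    (107, 10 * 10**15, 172),
    (127, 145 * 10**15, 188),
]

def factor_bits(p: int, k: int) -> float:
    """log2 of the candidate factor 2*k*M_p + 1, with M_p = 2^p - 1."""
    return math.log2(2 * k * ((1 << p) - 1) + 1)

for p, k, kernel_max in ROWS:
    bits = factor_bits(p, k)
    print(f"MM{p:<4} ~bits {bits:7.2f}   bits left {kernel_max - bits:5.2f}")
```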

Last fiddled with by kriesel on 2021-09-12 at 11:20

2021-09-30, 19:31   #383
ixfd64
Bemusing Prompter
"Danny"
Dec 2002
California
96E₁₆ Posts

I agree that it's time to add some new kernels for MM127. Someone could easily reach the limit in a few months using multiple GPUs if they're dedicated enough. We should extend the limit to at least 200 bits because there are quite a few people who are eager to solve the Catalan–Mersenne conjecture.

On a side note, there's now a version of mmff for trial factoring Wagstaff numbers with Mersenne prime exponents: https://mersenneforum.org/showpost.p...05&postcount=9

2021-09-30, 20:50   #384
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3^3·5·43 Posts

After mmff there's the possibility of Ernst's Mfactor, which I think can take MM127 to 192 bits. Each bit level would be slow, even with many threads/classes in parallel on a many-core/HT system.

2021-09-30, 21:48   #385
Batalov
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
3×3,191 Posts

Quote:
Originally Posted by Batalov View Post
Minor update -- v 0.28:

What's new:

The next set of 32 N values in k*2^N+1 Fermat factor testing is available.
The highest testable N is now 223, and the highest bitlevel is 252. Practically, because k<=2^45 are already tested, the highest usable N is 207 (was 175 in version 0.27), but double-checking may find something (known Fermat factors for N=217 (Suyama, 1980) and N=207 (Keller, 1984) are recovered as one of the QC tests).

As always, previous savefiles will not work with 0.28 unless the -nocheck argument is used.

All seven new kernels are thoroughly tested, but let me know if you get errors anyway.

Keep the factors coming!
I did add yet another 32 bits, a few years after 2012. And I did find a Fermat factor with my extended code, so I know that it works fine.

Check out the v 0.28 branch. It may be that only a few changes are needed to the middle-level code. The low-level code has been written and tested; 90% of the work is done.

A word of caution for those who would like to extend it by another 32 bits: I was looking at that opportunity back then, and there were not enough registers to continue in the same paradigm. Maybe someone could implement Karatsuba for kernels of the next size; iirc the existing code was not yet Karatsuba but simply "school long multiplication in quasi digits of the operationally optimal size". Karatsuba may be faster but may also need even more spare registers. Another thought is that now, 7 years later, some high-end cards will fit the required specs. Yet another thought is that someone might want to port it all from CUDA to universal.
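To illustrate the two multiplication strategies mentioned above, here is a minimal Python sketch on 32-bit limbs. The limb size is an assumption, and this is not the mmff kernel code. It only shows the trade-off: schoolbook multiplication needs n*n limb products, while one Karatsuba level replaces four half-size products with three at the cost of extra additions and temporaries, which is where the register concern comes from.

```python
LIMB_BITS = 32  # assumed limb size for illustration only
MASK = (1 << LIMB_BITS) - 1

def to_limbs(x: int, n: int) -> list:
    """Split x into n little-endian limbs."""
    return [(x >> (LIMB_BITS * i)) & MASK for i in range(n)]

def from_limbs(limbs: list) -> int:
    """Recombine little-endian limbs into an integer."""
    return sum(v << (LIMB_BITS * i) for i, v in enumerate(limbs))

def schoolbook(a: list, b: list) -> list:
    """Long multiplication: len(a) * len(b) limb products."""
    out = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            t = out[i + j] + ai * bj + carry
            out[i + j] = t & MASK
            carry = t >> LIMB_BITS
        out[i + len(b)] = carry
    return out

def karatsuba(x: int, y: int, n: int) -> int:
    """One Karatsuba level on n-limb inputs: three half-size products."""
    h = n // 2
    shift = h * LIMB_BITS
    low_mask = (1 << shift) - 1
    x0, x1 = x & low_mask, x >> shift
    y0, y1 = y & low_mask, y >> shift

    def mul(u: int, v: int, m: int) -> int:
        return from_limbs(schoolbook(to_limbs(u, m), to_limbs(v, m)))

    z0 = mul(x0, y0, h)
    z2 = mul(x1, y1, n - h)
    # The sums can carry into one extra limb, hence n - h + 1 limbs.
    z1 = mul(x0 + x1, y0 + y1, n - h + 1) - z0 - z2
    return z0 + (z1 << shift) + (z2 << (2 * shift))

a = (1 << 191) + 0x123456789ABCDEF
b = (1 << 192) - 1
print(from_limbs(schoolbook(to_limbs(a, 6), to_limbs(b, 6))) == a * b)
print(karatsuba(a, b, 6) == a * b)
```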
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.