View Single Post
Old 2020-11-27, 22:51   #1
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

22×5×172 Posts
Default Mfactor-specific thread

This is a reference thread specific to Ernst Mayer's mfactor program (not to be confused with Peter Montgomery's factor program). And if/when it matters, to the CPU-oriented builds of it. Please comment in the reference material discussion thread, not here. (Posts here may be incorporated with attribution, moved, or removed without recourse.)

In most cases, GIMPS trial factoring should be performed on GPUs using mfaktc or mfakto, or for special purposes, special programs such as mmff on NVIDIA GPUs. Mfactor comes into the picture for special cases they won't handle, such as trial factoring Mersenne numbers beyond those programs' limits.

This whole thread is a draft in progress and some posts may be mostly a placeholder at the moment or in portions.
Please note that Ernst has described this software as "experimental" and "unsupported". Expect some rough edges.
There's a forum thread about Mfactor here, which includes links to previous threads.


Mfactor does TF on CPU. Limits depend on executable type, compilation to support some number of words width, and CPU type. (There was also a GPU capability. I've not compiled that, as it had lower performance and no higher exponent coverage than mfakto or mfaktc. Use them instead.) See the bits table attached.


(Getting started section work in progress)
Initial install

I usually set up a Windows subfolder and desktop shortcut early, to make testing out things easy.
Tastes differ. This is what I typically use, everything inside the "", adapt to your tastes and system directory structure etc:
Shortcut name: "cmd in mfactor"
Target: "C:\Windows\System32\cmd.exe /k"
Start in directory: "C:\Users\kriesel\Documents\mfactor"
Download a suitable program and put it in that directory.
If on Windows, it also needs libwinpthread-1.dll, even though it's single-threaded, because of how it was compiled. See second attachment on this post.

I usually start by a simple run to emit a help file. Note Ernst has disavowed Mfactor's help accuracy. View its content as hints to what may work, not promises. In particular, checkpoint files are disabled since the addition of multithreading in the code. Also note it sometimes gets confused about the OS; this was run on Windows 10 Home x64, but built on Win 7 using msys2 which supports Linux style build tools on the Windows OS. Mfactor and Mlucas mistake that build environment for Linux.
Code:
mfactor-base-1w -h >help.txt
Code:
mfactor v2020-03-05
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 8.2.0.
INFO: CPU supports SSE2 instruction set, but using scalar floating-point build.
INFO: Using inline-macro form of MUL_LOHI64.
INFO: MLUCAS_PATH is set to ""
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
 Mfactor command line options ...
 <CR>        Default mode: prompts for manual keyboard entry

 -h          Prints this help file and exits

 -m {num}    Trial-factor the Mersenne number M(num) = 2^num - 1.

 -mm {num}   Trial-factor the double-Mersenne number M(M(num)) = 2^(2^num) - 1.

 -f {num}    Trial-factor the Fermat number F(num) = 2^(2^num) + 1.

 -file {string}    Name of checkpoint file (needed for restart-from-interrupt)

 -bmin {num} Log2(minimum factor to try), in floating double form.
 If > 10^9 its whole-number part is taken as the kmin value instead.

 -bmax {num} Log2(maximum factor to try), in floating double form.
 If > 10^9 its whole-number part is taken as the kmax value instead.

 -kmin {num}  Lowest factor K value to be tried in each pass ( > 0).

 -kmax {num} Highest factor K value to be tried in each pass ( < 2^64).

 -passmin {num}  Current factoring pass (0-15).

 -passmax {num}  Maximum pass for the run (0-15).
Choose the fewest-word build that fits the requirements of the task at hand. See the attached mfactor bits table.pdf

Test for basic operation:
If building yourself, or downloading someone else's build, test the resulting build(s) such as by finding the small known factors of MM31.
If unfamiliar with Mfactor, try it out on something simple and fast, such as the following.

A simple example: command line
Code:
mfactor-base-1w -bmin 1 -bmax 48 -mm 31
gives output to console
Code:
mfactor v2020-03-05
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 8.2.0.
INFO: CPU supports SSE2 instruction set, but using scalar floating-point build.
INFO: Using inline-macro form of MUL_LOHI64.
'printf' is not recognized as an internal or external command,
operable program or batch file.
INFO: MLUCAS_PATH is set to ""
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
Mfactor build flags:
TRYQ = 4
NUM_SIEVING_PRIME = 100000
TF_CLASSES = 60
MULH64_FAST = true
FACTOR_STANDALONE = true
NOBRANCH = true
USE_128x96 = 1
Mfactor self-tests:
Apr2015 mi64_div quicktest passes.
mi64_div quicktest passes.
Base-2 PRP test of M127 passed: Time = 00:00:00.000
Base-2 PRP test of M607 passed: Time = 00:00:00.000
Base-3 PRP test of M607 passed: Time = 00:00:00.002
Base-2 PRP test of M4423 passed: Time = 00:00:00.078
Base-3 PRP test of M4423 passed: Time = 00:00:00.318
Testing 64-bit Fermat factors...
Testing 128-bit Fermat factors...
Testing 192-bit Fermat factors...
Testing 256-bit Fermat factors...
Testing > 256-bit Fermat factors...
Testing 63-bit factors...
Testing 64-bit factors...
Testing 65-bit factors...
Testing 96-bit factors...
Factoring self-tests completed successfully.
p mod 60 = 7
INFO: No factoring savefile t31 found ... starting from scratch.
Allocated 255255 words in master template, 4255 in per-pass bit_map [16 x that in bit_atlas]
Generating difference table of first 100000 small primes
Using first 100000 odd primes; max gap = 114
max sieving prime = 1299721
Searching in the interval k=[0, 16336320], i.e. q=[1.000000e+000, 7.016396e+016]
Each of  16 (p mod 60) passes will consist of 1 intervals of length 272272
2949120 ones bits of 16336320 in master sieve template.
TRYQ = 4, max sieving prime = 1299721
Time to set up sieve = 00:00:00.044
pass = 0
pass = 1
pass = 2
pass = 3
pass = 4
pass = 5
pass = 6
pass = 7
pass = 8
pass = 9
pass = 10
pass = 11
        Factor with k = 68745. This factor is a probable prime.

pass = 12
pass = 13
pass = 14
pass = 15
MM(31) has 1 factors in range k = [0, 16336320], passes 0-15
Performed 657696 trial divides
Clocks = 00:00:01.302
and appending to file results.txt in the working directory, of:
Code:
Searching in the interval k=[0, 16336320], i.e. q=[1.000000e+000, 7.016396e+016]
Each of  16 (p mod 60) passes will consist of 1 intervals of length 272272

    Factor with k = 68745. This factor is a probable prime.
M(31) has 1 factors in range k = [0, 16336320], passes 0-15
Note the last result line above is incorrect. It should have indicated MM(31), or M(2147483647), not M(31). That occurs only in results.txt, not in output to the console. It only affects -mm x results, not -m x. A workaround for that rare bug is to use "-m 2147483647" instead, as in the following command line:
Code:
mfactor-base-1w -bmin 58 -bmax 60 -m 2147483647
That yields no factor. Results.txt content contributed by that run was:
Code:
Searching in the interval k=[65345280, 277717440], i.e. q=[2.806558e+017, 1.192787e+018]
Each of  16 (p mod 60) passes will consist of 13 intervals of length 272272
M(2147483647) has 0 factors in range k = [65345280, 277717440], passes 0-15
Single threaded:
Let's suppose we wanted to run one more bit level for M9999999943. That's nearly 1010, too big for mfaktc or mfakto because the exponent is > 2^32. Normally factoring such large exponents is discouraged. But it will be used here because large exponents and low bit levels make fast examples.
Look up the exponent on mersenne.ca: https://www.mersenne.ca/exponent/9999999943 shows it has been trial factored up to 71 bits. We want to run the entire bit level, from 71 to 72 bits.
(Reserve the assignment to avoid colliding with someone else that's following the rules)
Check the bits table to determine the smallest-word-length that's eligible, because that will be fastest. The exponent fits in one word. The max bit level is also well within the one-word capability. The max k can be determined from f=2kp+1 ~ 272. k=floor((f-1)/2/p), in this case 236,118,325,489 which is ~237.78<264. We have a winner.
For conceptual simplicity, let's run a 16-pass single threaded version, and the base x86_64 that ought run on any modern Intel 64-bit CPU.
Then the command line for Mfactor is
Code:
mfactor-base-1w.exe -m 9999999943 -bmin 71 -bmax 72
and the run on a Windows 10 x64 Home i7-1165G7 laptop indicates ~33 minutes run time:
Code:
mfactor v2020-03-05
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 8.2.0.
INFO: CPU supports SSE2 instruction set, but using scalar floating-point build.
INFO: Using inline-macro form of MUL_LOHI64.
'printf' is not recognized as an internal or external command,
operable program or batch file.
INFO: MLUCAS_PATH is set to ""
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
Mfactor build flags:
TRYQ = 4
NUM_SIEVING_PRIME = 100000
TF_CLASSES = 60
MULH64_FAST = true
FACTOR_STANDALONE = true
NOBRANCH = true
USE_128x96 = 1
Mfactor self-tests:
Apr2015 mi64_div quicktest passes.
mi64_div quicktest passes.
Base-2 PRP test of M127 passed: Time = 00:00:00.000
Base-2 PRP test of M607 passed: Time = 00:00:00.000
Base-3 PRP test of M607 passed: Time = 00:00:00.002
Base-2 PRP test of M4423 passed: Time = 00:00:00.067
Base-3 PRP test of M4423 passed: Time = 00:00:00.232
Testing 64-bit Fermat factors...
Testing 128-bit Fermat factors...
Testing 192-bit Fermat factors...
Testing 256-bit Fermat factors...
Testing > 256-bit Fermat factors...
Testing 63-bit factors...
Testing 64-bit factors...
Testing 65-bit factors...
Testing 96-bit factors...
Factoring self-tests completed successfully.
INFO: No factoring savefile t9999999943 found ... starting from scratch.
Allocated 255255 words in master template, 4255 in per-pass bit_map [16 x that in bit_atlas]
Generating difference table of first 100000 small primes
Using first 100000 odd primes; max gap = 114
max sieving prime = 1299721
Searching in the interval k=[118046248320, 236125169280], i.e. q=[2.360925e+021, 4.722503e+021]
Each of  16 (p mod 60) passes will consist of 7228 intervals of length 272272
2949120 ones bits of 16336320 in master sieve template.
TRYQ = 4, max sieving prime = 1299721
Time to set up sieve = 00:00:00.049
pass = 0.............[k = 144999333545].............[k = 171951897305].............[k = 198904175645].............[k = 225859273565]....
pass = 1.............[k = 145002567128].............[k = 171956644088].............[k = 198908961008].............[k = 225862152428]....
pass = 2.............[k = 145000796832].............[k = 171952301052].............[k = 198906699432].............[k = 225857822232]....
pass = 3.............[k = 145001975837].............[k = 171958146797].............[k = 198910531997].............[k = 225866400137]....
pass = 4.............[k = 144998753900].............[k = 171950858120].............[k = 198903586280].............[k = 225856999700]....
pass = 5.............[k = 144997601721].............[k = 171951291861].............[k = 198906318861].............[k = 225859767621]....
pass = 6.............[k = 145001968652].............[k = 171957272552].............[k = 198908452292].............[k = 225862681712]....
pass = 7.............[k = 145000102653].............[k = 171952697133].............[k = 198907395813].............[k = 225862060713]....
pass = 8.............[k = 145000255236].............[k = 171956548476].............[k = 198911198436].............[k = 225865900656]....
pass = 9.............[k = 145001376821].............[k = 171956349581].............[k = 198910664141].............[k = 225861936521]....
pass = 10.............[k = 145000559625].............[k = 171956240805].............[k = 198908430525].............[k = 225860873745]....
pass = 11.............[k = 145001525568].............[k = 171954862548].............[k = 198907540188].............[k = 225860573928]....
pass = 12.............[k = 145002299813].............[k = 171955457753].............[k = 198907208213].............[k = 225862589693]....
pass = 13.............[k = 145000528136].............[k = 171954980936].............[k = 198909000656].............[k = 225863660096]....
pass = 14.............[k = 145001672757].............[k = 171953367597].............[k = 198906481317].............[k = 225859766157]....
pass = 15.............[k = 144999595200].............[k = 171951504420].............[k = 198905326260].............[k = 225859744080]....
M(9999999943) has 0 factors in range k = [118046248320, 236125169280], passes 0-15
Performed 4703862502 trial divides
Clocks = 00:33:14.472
Those dots and k-values indicate progress and can be used to continue from near a stopping point if a run or system crashes, to avoid repeating the work from the beginning.


Verifying a known factor
Ok, suppose we wanted to just verify one of the factors found, on that exponent, using TF in Mfactor. Either confirming the factor, or testing mfactor.
We could run the whole bit level containing the factor. But that can take a long time.
We can instead of specifying min and max bit levels with -bmin and -bmax, specify k min and max values with -kmin and -kmax. Let's try to verify the 67 bit factor which has k=8408497208. Even if we specify kmin and kmax very close together, IIRC it will run a range corresponding to the size of Mfactor's small-primes sieve, 16336320. This is pretty quick, but not nearly as fast as the server's verification.
Code:
mfactor v2020-03-05
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 8.2.0.
INFO: CPU supports SSE2 instruction set, but using scalar floating-point build.
INFO: Using inline-macro form of MUL_LOHI64.
'printf' is not recognized as an internal or external command,
operable program or batch file.
INFO: MLUCAS_PATH is set to ""
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
Mfactor build flags:
TRYQ = 4
NUM_SIEVING_PRIME = 100000
TF_CLASSES = 60
MULH64_FAST = true
FACTOR_STANDALONE = true
NOBRANCH = true
USE_128x96 = 1
Mfactor self-tests:
Apr2015 mi64_div quicktest passes.
mi64_div quicktest passes.
Base-2 PRP test of M127 passed: Time = 00:00:00.000
Base-2 PRP test of M607 passed: Time = 00:00:00.000
Base-3 PRP test of M607 passed: Time = 00:00:00.002
Base-2 PRP test of M4423 passed: Time = 00:00:00.077
Base-3 PRP test of M4423 passed: Time = 00:00:00.240
Testing 64-bit Fermat factors...
Testing 128-bit Fermat factors...
Testing 192-bit Fermat factors...
Testing 256-bit Fermat factors...
Testing > 256-bit Fermat factors...
Testing 63-bit factors...
Testing 64-bit factors...
Testing 65-bit factors...
Testing 96-bit factors...
Factoring self-tests completed successfully.
INFO: No factoring savefile t9999999943 found ... starting from scratch.
Allocated 255255 words in master template, 4255 in per-pass bit_map [16 x that in bit_atlas]
Generating difference table of first 100000 small primes
Using first 100000 odd primes; max gap = 114
max sieving prime = 1299721
Searching in the interval k=[8396868480, 8413204800], i.e. q=[1.679374e+020, 1.682641e+020]
Each of  16 (p mod 60) passes will consist of 1 intervals of length 272272
2949120 ones bits of 16336320 in master sieve template.
TRYQ = 4, max sieving prime = 1299721
Time to set up sieve = 00:00:00.043
pass = 0
pass = 1
        Factor with k = 8408497208. This factor is a probable prime.

pass = 2
pass = 3
pass = 4
pass = 5
pass = 6
pass = 7
pass = 8
pass = 9
pass = 10
pass = 11
pass = 12
pass = 13
pass = 14
pass = 15
M(9999999943) has 1 factors in range k = [8396868480, 8413204800], passes 0-15
Performed 650708 trial divides
Clocks = 00:00:01.118
Note a few lines above that the program expanded the k range. If we knew which pass it would occur in, we could make it considerably faster. Using the many-classes version would make it still faster.

We could also consider trying verifying both factors, using the product of the 2 k's.
k = 8408497208 * 41901698172 = 352330312089720703776. That is over 68 bits. Still fits in 1-word executable's exponent and bits limits. But there's a third limit, on k, of 64 bits;
Code:
mfactor-base-1w.exe -m 9999999943 -kmin 352330312089720703775 -kmax 352330312089720703777
fails because the k values exceed 64 bits, which is a global constraint independent of number of words.
So in this case, verify the factors separately.
Code:
mfactor-base-1w.exe -m 9999999943 -kmin 41901698171 -kmax 41901698173
Poor man's multithreaded:
Each process will make entries in results.txt upon completion. It is up to the user to examine them and determine whether a factor was found by any of the processes.

Linux multithreaded
...more someday...

)

Some Mfactor notes:
  1. The effective minimum kmax is set to 16,336,320 by the size of the small-primes sieve. https://www.mersenneforum.org/showpo...6&postcount=16
  2. NWord is limited to 64,000,000 bits exponent and factor. So it could theoretically be used on MM57885161 but not MM74207281 and up.
  3. The help output of the program is considerable. https://www.mersenneforum.org/showpo...8&postcount=24 Ernst cautioned it may not be current.
  4. Savefiles are not implemented, although the program emits messages about them.
  5. It runs controlled by command line parameters. No ini files, config files, worktodo files, etc.
  6. It outputs to stdout and stderr. Any logging is because of redirection or tee use, on command lines, in batch files, or shell scripts.
  7. Use of too low a bmax will produce errors. bmax<2p produces kmax < 1. In some cases bmax < 227p will produce such an error.
    Code:
    ERROR: at line 3380 of file ../factor.c
    Assertion failed: 2.k.p overflows!
    or
    Code:
    ERROR: at line 1697 of file ../factor.c
    Assertion failed: Something went wrong with the computation of kmax ... possibly your bmax implies kmax > 64-bit?
  8. Don't mix -bmin and -kmax, or -kmin and -bmax, on the same command line. It won't accept that.
  9. I suggest all Mfactor users keep logs by redirecting stdout to a file, & perhaps stderr to another, because you might need to support your claim of having run an entire bit level. These can be large, but as plain text are highly compressible.

Naming convention is as follows, for the builds posted in this thread:
Mfactor-<arch>-<x>w[-tfc][-mt], where:
<arch> is base, for any x86-64, ...
<x> is number of words, or variable if n is present;
if -tfc is present, it's the 960-pass out of 4620 classes variant, otherwise it's 16-pass out of 60 classes;
if -mt is present, it's a multithreaded build, otherwise it's single-threaded.


Table of contents for Mfactor-specific thread (this thread)
  1. Intro and table of contents (this post)
  2. 16-pass Windows builds https://www.mersenneforum.org/showpo...27&postcount=2
  3. 960-pass Windows builds https://www.mersenneforum.org/showpo...28&postcount=3
  4. Linux builds https://www.mersenneforum.org/showpo...29&postcount=4
  5. Poor-man's multithreading approximation https://www.mersenneforum.org/showpo...33&postcount=5
  6. Wish list https://www.mersenneforum.org/showpo...88&postcount=6
  7. Bug list https://www.mersenneforum.org/showpo...63&postcount=7
  8. etc

For more background, see https://www.mersenneforum.org/showthread.php?t=25009 and other content available from links there


Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Attached Files
File Type: pdf mfactor bits table.pdf (10.9 KB, 87 views)
File Type: zip libwinpthread-1.zip (23.6 KB, 9 views)

Last fiddled with by kriesel on 2021-10-03 at 17:31 Reason: added initial install, examples to getting started section
kriesel is offline   Reply With Quote