2020-11-27, 23:04   #4
kriesel
 
Linux builds

These were built on Ubuntu 18.04 / WSL on a Windows 10 Pro x64 i7-8750H system. See the mfactor bits table attached to post one of this thread. For an indication of what is possible on Knights Landing with many threads on Linux, see also https://www.mersenneforum.org/showpo...&postcount=165

I believe, based on comparing file dates and release dates, that these were created from the source files released with Mlucas V19.0.

Build process for V19.0 was something like the following, for the base x64 1word single-thread build:
wget https://www.mersenneforum.org/mayer/...mlucas_v19.txz (or, alternatively, https://www.mersenneforum.org/mayer/...lucas_v19.tbz2)
tar -xJf filename (for the .txz) or tar -xjf filename (for the .tbz2) to unpack
mv factor.c.txt factor.c
mkdir ./obj_mfac
cd ./obj_mfac
gcc -c -Os ../get*.c && rm get_preferred_fft_radix.o
gcc -c -Os ../imul_macro.c ../mi64.c ../qfloat.c ../rng_isaac.c ../two*c ../types.c ../util.c
gcc -c -Os -DFACTOR_STANDALONE -DTRYQ=4 ../factor.c ../get_cpuid.c
gcc -o Mfactor-base-1w *o -lm

For the 2-word, many-factor-class, multithreaded build, after the preceding steps, something like:
gcc -c -Os -DUSE_THREADS -DFACTOR_STANDALONE -DTRYQ=4 -DTF_CLASSES=4620 -DP2WORD ../factor.c ../get_cpuid.c
gcc -c -Os -DUSE_THREADS ../threadpool.c ../util.c
gcc -o Mfactor-base-2w-tfc-mt *o -lm -lpthread

A test run of the 1w executable, on Ubuntu 18.04/WSL1/Win10HomeX64, i7-8750H CPU:
Code:
./Mfactor-base-1w -mm 31 -bmin 1 -bmax 48
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 7.4.0.
INFO: CPU supports SSE2 instruction set, but using scalar floating-point build.
INFO: Using inline-macro form of MUL_LOHI64.
INFO: MLUCAS_PATH is set to ""
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
Mfactor build flags:
TRYQ = 4
NUM_SIEVING_PRIME = 100000
TF_CLASSES = 60
MULH64_FAST = true
FACTOR_STANDALONE = true
NOBRANCH = true
USE_128x96 = 1
Mfactor self-tests:
Apr2015 mi64_div quicktest passes.
mi64_div quicktest passes.
Base-2 PRP test of M127 passed: Time = 00:00:00.000
Base-2 PRP test of M607 passed: Time = 00:00:00.000
Base-3 PRP test of M607 passed: Time = 00:00:00.000
Base-2 PRP test of M4423 passed: Time = 00:00:00.093
Base-3 PRP test of M4423 passed: Time = 00:00:00.375
Testing 64-bit Fermat factors...
Testing 128-bit Fermat factors...
Testing 192-bit Fermat factors...
Testing 256-bit Fermat factors...
Testing > 256-bit Fermat factors...
Testing 63-bit factors...
Testing 64-bit factors...
Testing 65-bit factors...
Testing 96-bit factors...
Factoring self-tests completed successfully.
p mod 60 = 7
INFO: Will write savefile t31 every 2^28 = 268435456 factor candidates tried.
INFO: No factoring savefile t31 found ... starting from scratch.
Allocated 255255 words in master template, 4255 in per-pass bit_map [16 x that in bit_atlas]
Generating difference table of first 100000 small primes
Using first 100000 odd primes; max gap = 114
max sieving prime = 1299721
Searching in the interval k=[0, 16336320], i.e. q=[1.000000e+00, 7.016396e+16]
Each of  16 (p mod 60) passes will consist of 1 intervals of length 272272
2949120 ones bits of 16336320 in master sieve template.
TRYQ = 4, max sieving prime = 1299721
Time to set up sieve = 00:00:00.078
pass = 0
pass = 1
pass = 2
pass = 3
pass = 4
pass = 5
pass = 6
pass = 7
pass = 8
pass = 9
pass = 10
pass = 11
        Factor with k = 68745. This factor is a probable prime.

pass = 12
pass = 13
pass = 14
pass = 15
MM(31) has 1 factors in range k = [0, 16336320], passes 0-15
Performed 657696 trial divides
Clocks = 00:00:01.390
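As a cross-check of the factor reported in the run above, the k value can be converted back to a candidate q and tested directly. This is a minimal Python sketch, assuming (as is standard for Mersenne trial factoring, and consistent with the q range printed in the log) that k indexes candidates q = 2*k*p + 1 with p = 2^31 - 1. The variable names are illustrative and not taken from factor.c.

```python
# Sanity check of the logged factor, ASSUMING Mfactor's k indexes
# candidates q = 2*k*p + 1 of MM(31) = 2^p - 1 with p = 2^31 - 1.
p = 2**31 - 1          # exponent of MM(31), i.e. M31
k = 68745              # k value printed in the log
q = 2 * k * p + 1      # candidate factor

# q divides 2^p - 1 exactly when 2^p is congruent to 1 modulo q.
divides = pow(2, p, q) == 1

print(f"q = {q} ({q.bit_length()} bits), divides MM(31): {divides}")
```

The resulting q lies slightly above 2^48, which suggests the search range is rounded up to a whole sieve interval rather than cut off exactly at bmax.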
At least three things were wrong with the latter ("2-word, multithreaded") executable as I first posted it; two remain.
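1) Running with -nthread 2 or more fails, as does omitting -nthread, which defaults to the number of hyperthreads. Only -nthread 1 worked. The logs show two different assertions: with 2 to 6 threads, "Multithreading currently only supported for SIMD builds!"; with more than 6, "Multithreading requires max_threads >= NTHREADS". That may be due to an error in my build sequence for multithreaded compiles.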
Code:
./Mfactor-base-2w-tfc-mt -mm 31 -bmin 1 -bmax 48 -nthread 2
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 7.4.0.
INFO: CPU supports SSE2 instruction set, but using scalar floating-point build.
INFO: Using inline-macro form of MUL_LOHI64.
INFO: MLUCAS_PATH is set to ""
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
INFO: System has 12 available processor cores.
NTHREADS = 2
Set affinity for the following 2 cores: 0.1.
Factor.c: Init threadpool of 2 threads
twopmodq96_q4: Setting up for as many as 6 threads...
ERROR: at line 1092 of file ../twopmodq80.c
Assertion failed: Multithreading currently only supported for SIMD builds!
Code:
./Mfactor-base-2w-tfc-mt -mm 31 -bmin 1 -bmax 48 -nthread 6
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 7.4.0.
INFO: CPU supports SSE2 instruction set, but using scalar floating-point build.
INFO: Using inline-macro form of MUL_LOHI64.
INFO: MLUCAS_PATH is set to ""
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
INFO: System has 12 available processor cores.
NTHREADS = 6
Set affinity for the following 6 cores: 0.1.2.3.4.5.
Factor.c: Init threadpool of 6 threads
twopmodq96_q4: Setting up for as many as 6 threads...
ERROR: at line 1092 of file ../twopmodq80.c
Assertion failed: Multithreading currently only supported for SIMD builds!
Code:
./Mfactor-base-2w-tfc-mt -mm 31 -bmin 1 -bmax 48 -nthread 7
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 7.4.0.
INFO: CPU supports SSE2 instruction set, but using scalar floating-point build.
INFO: Using inline-macro form of MUL_LOHI64.
INFO: MLUCAS_PATH is set to ""
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
INFO: System has 12 available processor cores.
NTHREADS = 7
Set affinity for the following 7 cores: 0.1.2.3.4.5.6.
Factor.c: Init threadpool of 7 threads
twopmodq96_q4: Setting up for as many as 6 threads...
ERROR: at line 482 of file ../twopmodq96.c
Assertion failed: Multithreading requires max_threads >= NTHREADS
Code:
./Mfactor-base-2w-tfc-mt -mm 31 -bmin 1 -bmax 48
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 7.4.0.
INFO: CPU supports SSE2 instruction set, but using scalar floating-point build.
INFO: Using inline-macro form of MUL_LOHI64.
INFO: MLUCAS_PATH is set to ""
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
INFO: System has 12 available processor cores.
NTHREADS = 12
Set affinity for the following 12 cores: 0.1.2.3.4.5.6.7.8.9.10.11.
Factor.c: Init threadpool of 12 threads
twopmodq96_q4: Setting up for as many as 6 threads...
ERROR: at line 482 of file ../twopmodq96.c
Assertion failed: Multithreading requires max_threads >= NTHREADS
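2) The reported clocks are way off in a single-thread test: the log reports Clocks =24431:25:25.000 for a run that takes about 1m37s of wall time when rerun under time.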
Code:
./Mfactor-base-2w-tfc-mt -mm 31 -bmin 1 -bmax 48 -nthread 1
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 7.4.0.
INFO: CPU supports SSE2 instruction set, but using scalar floating-point build.
INFO: Using inline-macro form of MUL_LOHI64.
INFO: MLUCAS_PATH is set to ""
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
INFO: System has 12 available processor cores.
NTHREADS = 1
Set affinity for the following 1 cores: 0.
Factor.c: Init threadpool of 1 threads
twopmodq96_q4: Setting up for as many as 6 threads...
*Mfactor build flags:
TRYQ = 4
NUM_SIEVING_PRIME = 100000
TF_CLASSES = 4620
MULH64_FAST = true
FACTOR_STANDALONE = true
NOBRANCH = true
USE_128x96 = 1
Mfactor self-tests:
Apr2015 mi64_div quicktest passes.
mi64_div quicktest passes.
Base-2 PRP test of M127 passed: Time = 00:00:00.000
Base-2 PRP test of M607 passed: Time = 00:00:00.000
Base-3 PRP test of M607 passed: Time = 04:20:25.000
Base-2 PRP test of M4423 passed: Time = 39:03:45.000
Base-3 PRP test of M4423 passed: Time =151:54:35.000
Testing 64-bit Fermat factors...
Testing 128-bit Fermat factors...
Testing 192-bit Fermat factors...
Testing 256-bit Fermat factors...
Testing > 256-bit Fermat factors...
Testing 63-bit factors...
Testing 64-bit factors...
Testing 65-bit factors...
Testing 96-bit factors...
Factoring self-tests completed successfully.
p mod 4620 = 3727
p mod 4620 v2 = 1387
Warning: Differing (p % TF_CLASSES) values from Powering and direct-long-div! Proceeding using the 2nd result (1387).
INFO: Will write savefile t31 every 2^28 = 268435456 factor candidates tried.
INFO: No factoring savefile t31 found ... starting from scratch.
Allocated 255255 words in master template, 3537 in per-pass bit_map [960 x that in bit_atlas]
Generating difference table of first 100000 small primes
Using first 100000 odd primes; max gap = 114
max sieving prime = 1299721
Searching in the interval k=[0, 1045524480], i.e. q=[1.000000e+00, 4.490493e+18]
Each of 960 (p mod 4620) passes will consist of 1 intervals of length 226304
2949120 ones bits of 16336320 in master sieve template.
TRYQ = 4, max sieving prime = 1299721
Time to set up sieve =789:55:50.000
INFO: 960 passes to do; bit_map has 3536 64-bit words.
INFO: Doing 960 threadpool-waves of 1 pool threads each:
Pass 0:
...
Pass 217:
        Factor with k = 20269004. This factor is a probable prime.

Pass 218:
...
Pass 843:
        Factor with k = 68745. This factor is a probable prime.

Pass 844:
...
Pass 957:
Pass 958:
Pass 959:
MM(31) has 2 factors in range k = [0, 1045524480], passes 0-959
Performed 41933177 trial divides
Clocks =24431:25:25.000
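The log above warns that the two computations of p % TF_CLASSES disagree (3727 from powering, 1387 from direct long division) and proceeds with 1387. A direct computation supports that choice. The sketch below also recounts the number of passes, assuming the usual elimination rule for Mersenne factor candidates (q = 2*k*p + 1 must be congruent to 1 or 7 mod 8 and share no prime factor with TF_CLASSES). That rule is an assumption here, not read from factor.c, and surviving_classes is a hypothetical helper name.

```python
from math import gcd

p = 2**31 - 1  # exponent of MM(31)


def surviving_classes(p, tf_classes):
    """Count k residues mod tf_classes for which q = 2*k*p + 1 can be a factor.

    ASSUMED rule (standard for Mersenne factors, not taken from factor.c):
    q must be 1 or 7 mod 8 and coprime to tf_classes.
    """
    count = 0
    for k in range(tf_classes):
        q = 2 * k * p + 1
        if q % 8 in (1, 7) and gcd(q, tf_classes) == 1:
            count += 1
    return count


for tf_classes in (60, 4620):
    print(tf_classes, p % tf_classes, surviving_classes(p, tf_classes))
```

Under these assumptions the counts agree with the 16 passes of the 60-class build and the 960 passes of the 4620-class build shown in the logs.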
A rerun under time, with standard output redirected to a file, gave:
Code:
time ./Mfactor-base-2w-tfc-mt -mm 31 -bmin 1 -bmax 48 -nthread 1 >2wmm31b48.txt
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
twopmodq96_q4: Setting up for as many as 6 threads...
Apr2015 mi64_div quicktest passes.
mi64_div quicktest passes.
Searching in the interval k=[0, 1045524480], i.e. q=[1.000000e+00, 4.490493e+18]
Each of 960 (p mod 4620) passes will consist of 1 intervals of length 226304

real    1m37.129s
user    1m30.875s
sys     0m0.234s
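Finding an additional "too-big" factor (k = 20269004, well beyond bmax 48) and the increased run time are consequences of using so many classes at a small bit level: the searched range grows from k=[0, 16336320] with 60 classes to k=[0, 1045524480] with 4620 classes.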
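The arithmetic behind this can be reproduced from the numbers printed in the two logs. The following Python sketch assumes that the smallest searchable k range is one interval length times TF_CLASSES (which matches both logged ranges) and that candidates have the form q = 2*k*p + 1. The dictionary and variable names are illustrative only.

```python
p = 2**31 - 1  # exponent of MM(31)

# (interval length, TF_CLASSES) as printed in each build's log
runs = {"1w, 60 classes": (272272, 60), "2w, 4620 classes": (226304, 4620)}

# ASSUMPTION: one interval per pass covers interval_length * TF_CLASSES values of k
k_range = {name: length * classes for name, (length, classes) in runs.items()}

for name, k_max in k_range.items():
    q_max = 2 * k_max * p + 1
    print(f"{name}: k_max = {k_max}, q_max has {q_max.bit_length()} bits")

# The factor reported only by the 4620-class build
k_extra = 20269004
q_extra = 2 * k_extra * p + 1
extra_divides = pow(2, p, q_extra) == 1
print(f"extra factor q = {q_extra} ({q_extra.bit_length()} bits), "
      f"divides MM(31): {extra_divides}")
```

Because k = 20269004 exceeds the 60-class range but falls inside the 4620-class range, only the second build reports it, even though both runs were given -bmax 48.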

3) It did not handle exponents larger than 57 bits, such as MM61 (from the bits table, I'd expect up to 114), or bmax higher than 96, as if it were a 1-word rather than a 2-word build. This was apparently a build error. The executable has since been rebuilt and tested, and no longer has that issue.



Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Attached Files
Mfactor-base-1w.gz (245.9 KB)
Mfactor-base-2w-tfc-mt.tar.gz (259.7 KB)

Last fiddled with by kriesel on 2021-09-19 at 20:46 Reason: version info, draft build process, 2word mt errors & replace