2020-11-27, 22:58   #3
kriesel
 
960-pass Windows builds

These were built using msys2 on a Windows 7 X64 Pro dual-Xeon E5645 system. They are single-threaded because that is all this build approach supports. The higher pass count (960 passes, versus 16 in the 60-class build) is somewhat more efficient at avoiding composite factor candidates.
It also allows more flexibility in the number of processes run in parallel, if you choose to run that way. See post 5 for batch files for running in parallel.
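
To illustrate the flexibility point, here is a minimal Python sketch. It is not part of Mfactor; the function name split_passes and the worker count of 12 are hypothetical choices for illustration. It divides a pass range into contiguous blocks, one per process. With 960 passes the blocks come out equal for many process counts, while 16 passes cannot be shared evenly among 12 processes. How a block maps onto actual program arguments is left to the batch files in post 5.

```python
def split_passes(total_passes, workers):
    """Return inclusive (first, last) pass ranges, as evenly sized as possible."""
    base, extra = divmod(total_passes, workers)
    ranges = []
    start = 0
    for i in range(workers):
        # The first `extra` workers take one additional pass each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more workers than passes: leave this worker idle
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    for total in (16, 960):
        blocks = split_passes(total, 12)  # 12 workers: illustrative assumption
        sizes = sorted({last - first + 1 for first, last in blocks})
        print(total, "passes ->", len(blocks), "blocks, block sizes", sizes)
```

With 960 passes every one of the 12 blocks holds 80 passes; with 16 passes some processes get 2 passes and others 1, so the run finishes only when the doubly loaded processes do.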

The differing word counts (1w through 4w, and nw) allow fast runs on small operands as well as support for bigger factors and exponents. See the mfactor bits table attached to post one of this thread.

These were built for the common base of 64-bit Intel-compatible CPUs, not the higher SSE2, AVX, AVX2, or AVX512 flavors, so they should run regardless of processor model. (Those higher processor capabilities are only supported for a subset of word lengths, as shown in the bits table attachment of post 1, but would give higher performance where supported.)

After renaming factor.c.txt to factor.c and building the 16-pass version, which had already compiled some of the needed modules, these were built with the following commands:
rem builds with a large number of passes, for better sieving, finer pass granularity, better manycore multithreading
gcc -c -Os -DFACTOR_STANDALONE -DTRYQ=4 -DTF_CLASSES=4620 ../factor.c ../get_cpuid.c
gcc -o Mfactor-base-1w-tfc *.o -lm

gcc -c -Os -DFACTOR_STANDALONE -DTRYQ=4 -DTF_CLASSES=4620 -DP2WORD ../factor.c
gcc -o Mfactor-base-2w-tfc *.o -lm

gcc -c -Os -DFACTOR_STANDALONE -DTRYQ=4 -DTF_CLASSES=4620 -DP3WORD ../factor.c
gcc -o Mfactor-base-3w-tfc *.o -lm

gcc -c -Os -DFACTOR_STANDALONE -DTRYQ=4 -DTF_CLASSES=4620 -DP4WORD ../factor.c
gcc -o Mfactor-base-4w-tfc *.o -lm

gcc -c -Os -DFACTOR_STANDALONE -DTRYQ=4 -DTF_CLASSES=4620 -DNWORD ../factor.c
gcc -o Mfactor-base-nw-tfc *.o -lm
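
For context on how 4620 classes yield 960 passes (and 60 classes yield 16), here is a rough Python sketch. It assumes the usual class sieve for Mersenne factor candidates q = 2kp+1: q must be congruent to 1 or 7 mod 8, and must not be divisible by any of the small primes 3, 5, 7, 11 that divide the class count. The function name is made up for illustration, and none of this is taken from factor.c.

```python
def surviving_classes(p, num_classes):
    """Count residues k mod num_classes for which q = 2*k*p + 1 is not ruled out.

    Assumes num_classes is a multiple of 4, so that q mod 8 depends only on
    the residue class of k.
    """
    small_primes = [s for s in (3, 5, 7, 11) if num_classes % s == 0]
    count = 0
    for k in range(num_classes):
        q = 2 * k * p + 1
        if q % 8 not in (1, 7):
            continue  # a factor of M(p) must be +/-1 mod 8
        if any(q % s == 0 for s in small_primes):
            continue  # every q in this class is divisible by a small prime
        count += 1
    return count


if __name__ == "__main__":
    p = 2147483647  # the exponent used in the timing comparison in this post
    print(surviving_classes(p, 60))    # 16
    print(surviving_classes(p, 4620))  # 960
```

Under these assumptions the surviving fraction drops from 16/60 (about 26.7%) to 960/4620 (about 20.8%), which is the sense in which the higher class count screens out more composite candidates before any trial division is done.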


These may be very useful for long tasks on manycore systems (dual Xeons, Xeon Phi). However, for small tasks they may be slower. Case in point:
On condorella (dual E5645, Win 7 X64 Pro)

It seems there is a considerable overhead penalty from using many classes at small exponent and bit level.
60 classes: M(2147483647) has 3 factors in range k = [0, 68726898240], passes 0-15
Performed 2740062501 trial divides
Clocks = 00:19:47.068 = 1187.068 seconds.

4620 classes: M(2147483647) has 3 factors in range k = [0, 69004615680], passes 0-959
Performed 2751128805 trial divides
Clocks = 00:23:34.701 = 1414.701 seconds = 1.19176 times that of the 60-classes 16-passes timing
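
As a quick check of the arithmetic above, this small Python sketch (the helper name is hypothetical) converts the reported clock strings to seconds and derives the timing ratio and the trial-divide rates from the figures quoted in this post.

```python
def clock_to_seconds(clock):
    """Convert an HH:MM:SS.mmm clock string to seconds."""
    hours, minutes, seconds = clock.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Figures copied from the two runs reported above.
t60 = clock_to_seconds("00:19:47.068")    # 60 classes, passes 0-15
t4620 = clock_to_seconds("00:23:34.701")  # 4620 classes, passes 0-959

ratio = t4620 / t60
rate60 = 2740062501 / t60      # trial divides per second, 60 classes
rate4620 = 2751128805 / t4620  # trial divides per second, 4620 classes

if __name__ == "__main__":
    print(f"time ratio: {ratio:.5f}")
    print(f"trial divides per second: {rate60:.0f} vs {rate4620:.0f}")
```

The two runs performed nearly the same number of trial divides, so the slowdown shows up as a lower rate (roughly 2.3 million versus 1.9 million trial divides per second), consistent with per-class overhead rather than extra division work.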

Based on comparing file dates and release dates, I believe these were created from the source files released with Mlucas V19.0.


Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Attached Files
File Type: exe Mfactor-base-1w-tfc.exe (891.7 KB, 138 views)
File Type: exe Mfactor-base-2w-tfc.exe (896.8 KB, 76 views)
File Type: exe Mfactor-base-3w-tfc.exe (899.3 KB, 77 views)
File Type: exe Mfactor-base-4w-tfc.exe (900.3 KB, 76 views)
File Type: exe Mfactor-base-nw-tfc.exe (889.2 KB, 80 views)

Last fiddled with by kriesel on 2021-09-19 at 20:43 Reason: version info