2020-10-19, 13:34   #3370
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
29×167 Posts

Finding factors is frequent, more than daily in my case, and no doubt for some others too. Here's another case of reduced credit for a factor found:

processing: TF factor 24706114790941063870961 for M114010123 (2^74-2^75)
CPU credit is 12.9969 GHz-days.
processing: TF no-factor for M114010289 (2^74-2^75)
CPU credit is 33.5587 GHz-days.

Finding a factor is the exception, not the rule, and multiple factors within one bit level is an exception within that. Prime95 behavior is sorta moot now that almost all trial factoring is done on GPUs, but it used to look for additional factors. Some excerpts from whatsnew.txt:

Code:
New features in Version 28.3 of prime95.exe
-------------------------------------------
4)  Information added to result lines containing "has a factor".  This
    information may be used by the server's manual web page to give proper
    TF / P-1 / ECM cpu credit at a future date.

New features in Version 23.9 of prime95.exe
-------------------------------------------
2)  A bug in continuing after finding a factor when using AdvancedFactor
    was fixed.

New features in Version 20.3 of prime95.exe
-------------------------------------------
3)  Prime95 no longer searches for a smaller factor when trial factoring
    discovers a factor.  The reasons are two-fold.  1) Version 19 had a
    bug where stopping and restarting the program bypassed the search for
    smaller factors.  Thus, my database may already be missing smaller
    factors.  2) As we factor larger exponents to a deeper depth it may
    no longer be a quick job to determine if there are smaller factors.
    Note that version 20 will still look for smaller factors if you are
    looking for factors below 2^60 with the FactorOverride option in
    undoc.txt.

The FactorOverride option seems to be gone from undoc.txt.
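As a sanity check on the result line above: any factor q of M(p) = 2^p - 1, for p an odd prime, must satisfy 2^p ≡ 1 (mod q), must be ≡ ±1 (mod 8), and must have the form q = 2kp + 1. A quick Python sketch checking the quoted factor against those properties (the script is mine, not part of the processing log):

```python
# Check the quoted TF result: q is claimed to be a factor of M114010123
# found in the 2^74-2^75 bit range.
p = 114010123
q = 24706114790941063870961

assert pow(2, p, q) == 1          # q divides 2^p - 1
assert q % 8 in (1, 7)            # Mersenne factors are ±1 (mod 8)
k, r = divmod(q - 1, 2 * p)
assert r == 0                     # q has the form 2*k*p + 1
assert 2**74 < q < 2**75          # consistent with the reported bit range
```

The 2kp+1 and ±1 (mod 8) properties are also what make trial factoring cheap: only candidates of that special form need testing.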
The prime95 undoc.txt seems to have no control over TF behavior upon finding a factor, nor does it say what the default behavior is: stop upon finding a factor, or finish the class, the bit level, or the worktodo entry. ECM retains an explicit control for continuing to search for additional factors. A single P-1 GCD, in either stage 1 or stage 2, may reveal one factor or several at once. Some excerpts from undoc.txt:

Code:
By default, ECM will stop when a new factor is found.  You can have ECM
always stop or always continue searching for factors by using a value of
zero or one in prime.txt:
	ContinueECM=n

You can force the program to skip the trial factoring step prior to
running a Lucas-Lehmer or PRP primality test.  In prime.txt add this
line:
	SkipTrialFactoring=1

You can tune trial factoring performance with several prime.txt
settings.  Probably only the first parameter below is worth tuning:
	MaxTFSievePrime=x (default is 155000 on AVX512 CPUs, 145000 on
	FMA CPUs, 1000000 otherwise)
The TF code uses a small prime sieve to eliminate composite trial
factors.  Set x to the maximum small prime used by this sieve.
MaxTFSievePrime is limited to the range of 5,000 to 100,000,000.
	ThreadsPerTFPool=n (default is number of hyperthreads)
When multithreading, set n to the number of threads to group together in
a "pool".  Pooling ensures that both sieving and trial factoring will be
done by threads within the same pool.  Thus, if the threads share a
cache, locality is increased.  For example, on Knights Landing the best
setting is 8 because 2 cores with 4 hyperthreads each share an L2 cache.
	PercentTFSieverThreads=y (default is 50)
When multithreading TF, set y to be the percentage of threads in each
pool that can be running the small prime sieve.  NOTE: If y is set to
100, then pooling is turned off.  That is, each thread sieves and then
immediately TFs.  While this offers perfect locality, it gives slightly
worse performance and I cannot explain why.  My best theory is that
pooling improves usage of the instruction cache.
	UseMaxSieverAllocs=0, 1, 2, or 3 (default is 3)
If UseMaxSieverAllocs is 1 then at least 7 sievers will be allocated,
resulting in a 14% reduction in single-threaded sieving.  If
UseMaxSieverAllocs is 2 then at least 60 sievers will be allocated,
resulting in a further 9% reduction in sieving.  If UseMaxSieverAllocs
is 3 then at least 720 sievers will be allocated, resulting in a further
reduction of 7%.  The downside is more memory is required and
initialization of each of the 16 factoring passes is slower.  Allocating
a lot of sievers can be detrimental when factoring to low bit levels.
	AlternateTFSieveCount=z (default is 9)
On AVX2 and AVX512 cpus, the code can sieve for small primes using
either traditional x86 instructions or using AVX2/AVX512 instructions.
As sieve primes get larger, the x86 code is faster than the AVX2/AVX512
code.  Set z to the number of blocks of small primes to use the
AVX2/AVX512 code path.  On my AVX2 Skylake cpu the optimal block count
is 9.  Using too large a value here can result in memory corruption
depending on the MaxTFSievePrime setting.

Last fiddled with by kriesel on 2020-10-19 at 13:50
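The small prime sieve that MaxTFSievePrime tunes exploits the special form of Mersenne factors: generate candidates q = 2kp + 1 with q ≡ ±1 (mod 8), discard candidates divisible by a small prime cheaply, and spend the expensive 2^p mod q powmod only on the survivors. A toy Python sketch of that idea (function names and limits are mine, not prime95's, and this ignores the pass/class structure the real code uses):

```python
def small_primes_upto(limit):
    """Naive list of the odd primes below `limit` (fine for a toy)."""
    return [n for n in range(3, limit, 2)
            if all(n % d for d in range(3, int(n**0.5) + 1, 2))]

def tf_mersenne(p, max_bits=20, sieve_limit=100):
    """Toy trial factoring of M(p) = 2^p - 1 for an odd prime p.
    Candidates have the form q = 2*k*p + 1 with q = +/-1 (mod 8);
    those divisible by a small prime are sieved out before the
    costly powmod test, the step that MaxTFSievePrime tunes."""
    primes = small_primes_upto(sieve_limit)
    k = 1
    while True:
        q = 2 * k * p + 1
        if q.bit_length() > max_bits:
            return None                  # no factor below 2^max_bits
        if q % 8 in (1, 7) and all(q % s for s in primes if s * s <= q):
            if pow(2, p, q) == 1:
                return q                 # q divides 2^p - 1
        k += 1
```

For example, tf_mersenne(11) finds 23, the smallest factor of M11 (k=1), and tf_mersenne(31) returns None since M31 is prime. The real code's tradeoff is visible even here: a larger sieve limit means fewer wasted powmods but more sieving work per candidate.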