2016-04-23, 20:18   #1
potonono

P1 Questions

1) How would P1 have found this factor? http://www.mersenne.ca/exponent/75751771 Isn't it outside the bounds used?

2) If P1 eliminates a certain portion of factor search space, could trial factoring save any time by skipping that same search space?


2016-04-23, 20:41   #2
Dubslow

1) The Brent-Suyama extension *sometimes* allows P-1 to find factors where B2 is otherwise too small. The way it works is a bit luck-based, though, so it's not the same as "a larger B2". Note that in the "P-1 results:" section of your link, where it has the userid listed, the column to the right of B2 is labeled "e". This column indicates whether the Brent-Suyama extension was in use: e=1 means it was not in use; e=2 or e=6 means it was in use, though with a lesser ability to find factors; and e=12 is the full strength of Prime95's Brent-Suyama implementation. For this factor, e was indeed 12.
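To make "outside the bounds" concrete, here is a minimal sketch of the textbook findability criterion for plain P-1 (my own Python, not Prime95's code; the names p1_findable and prime_factors are mine). A factor found only thanks to Brent-Suyama, like the one linked above, fails this test for the bounds actually used.

[CODE]
# Sketch of the standard P-1 findability criterion for a known factor
# q of 2^p - 1 (illustration only; Prime95's internals differ).
# q - 1 = 2*k*p, and stage 1 includes the exponent p by convention,
# so only the factorization of 2*k has to fit within the bounds.

def prime_factors(n):
    """Trial-divide n; return its prime factors with multiplicity."""
    out, d = [], 2
    while d * d <= n:
        while n % d == 0:
            out.append(d)
            n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out

def p1_findable(q, p, B1, B2):
    """True if plain P-1 with bounds (B1, B2) would find the factor q."""
    fac = prime_factors((q - 1) // p)      # prime factors of 2*k
    big = max(fac)
    for f in set(fac):
        power = f ** fac.count(f)          # full prime power in 2*k
        if power <= B1:
            continue                       # handled by stage 1
        if f == big and fac.count(f) == 1 and f <= B2:
            continue                       # the single stage-2 prime
        return False                       # out of reach for plain P-1
    return True
[/CODE]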

2) The search spaces are not compatible. There are factors that P-1 can find in 1 GHz-day that would take TF 1000, and conversely factors that TF can find in 1 GHz-day that would take P-1 1000. However, running one method does decrease the odds that the other will find a factor. Notice that running P-1 on the same (or nearly the same) exponents will have drastically different odds of finding a factor, as reported by Prime95, depending on how much TF has been done before the P-1.

2016-04-24, 06:17   #3
axn

Quote:
Originally Posted by potonono
2) If P1 eliminates a certain portion of factor search space, could trial factoring save any time by skipping that same search space?
Yes. Well, maybe. There is about a 30% overlap between TF & P-1 at the levels GIMPS is doing, which is to say, you could skip about 30% of TF candidate factors if you could identify them efficiently.
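As a toy illustration of where a number like 30% could come from, here is a self-contained sketch (my own, with made-up illustrative bounds, not GIMPS's actual parameters; sympy is used only as a convenient factoring helper): count how many surviving TF candidates q = 2kp+1 have k smooth enough for P-1 to have found them.

[CODE]
# Toy estimate of the TF/P-1 overlap: among candidate factors
# q = 2*k*p + 1 that TF would test, what fraction have k smooth
# enough that P-1 with bounds (B1, B2) would already find q?
from sympy import isprime, factorint

def overlap_fraction(p, kmin, kmax, B1, B2):
    candidates = smooth = 0
    for k in range(kmin, kmax):
        q = 2 * k * p + 1
        if q % 8 not in (1, 7) or not isprime(q):
            continue                 # cannot be a factor of 2^p - 1
        candidates += 1
        fac = factorint(2 * k)       # prime -> exponent
        over = [r for r, e in fac.items() if r ** e > B1]
        if not over or (len(over) == 1 and fac[over[0]] == 1
                        and over[0] <= B2):
            smooth += 1              # P-1 would have found this q
    return smooth / candidates if candidates else 0.0

# e.g. overlap_fraction(75751771, 1, 10**5, 10**4, 10**5)
[/CODE]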

The idea is to skip factors f where f-1 is smooth to a given bound (B1, B2). This can be done relatively efficiently using sieve techniques (cf. QS, NFS). I don't know if this would be a net win (especially on GPUs). But supposing that we pull this off, here are a couple of reasons not to do it.

1) We'll have to have two code paths for TF, one that skips smooth factors, and one normal. The program will have to decide which one to use based on whether P-1 has been run or not. There will also be bookkeeping overhead to keep track of which version was used.
2) What if there was an error during P-1 and a factor was missed? If we skip those factors in TF as well, we lose our chance to eliminate that test. This, I would say, is a deal breaker.

2016-04-25, 03:12   #4
potonono

Quote:
Originally Posted by axn
1) We'll have to have two code paths for TF, one that skips smooth factors, and one normal. The program will have to decide which one to use based on whether P-1 has been run or not. There will also be bookkeeping overhead to keep track of which version was used.
If P1 were a requirement, you would just need one code path. Isn't it a requirement that after a certain amount of TF, P1 is required before LL?

Quote:
Originally Posted by axn
2) What if there was an error during P-1 and a factor was missed? If we skip those factors in TF as well, we lose our chance to eliminate that test. This, I would say, is a deal breaker.
What if there is an error in TF and a factor is missed? Unless double-checking is being done (for both TF and P1) with some kind of residue, like the LL tests have, we can't be sure, and a potential 30% savings seems worthwhile.


All that aside, I agree something like that heavily depends on being able to identify the overlap efficiently, which more than likely wouldn't be easy... Thanks for the info, Dubslow and axn!

2016-04-25, 04:53   #5
axn

Quote:
Originally Posted by potonono
If P1 were a requirement, you just need one code path. Isn't it a requirement that after a certain amount of TF, P1 is required before LL?
TF run before P-1 mustn't skip the overlapping factors.

Only TF run after P-1 should skip, because by then you know that such smooth factors cannot divide the number. If you skip the smooth factors before P-1 has been run, you could end up running unnecessary TF (at higher bit levels, when a smooth factor at a smaller bit level could have been found) and an unnecessary P-1.

Hence we would need two code paths for TF; which one is run would depend on whether P-1 has been run or not.

Currently, all of TF is performed before P-1, so as per my scheme we would not skip overlapping factors at all. However, there is an idea of running the highest bit depth of TF after running P-1. If such a scheme were implemented, it could usefully skip factors for that final bit depth.

Quote:
Originally Posted by potonono
What if there is an error in TF and a factor was missed? Unless double-checking is being done (for both TF and P1) with some kind of residue like the LL tests, we can't be sure, and potential 30% seems worthwhile.
It is complicated. If we have a low error rate on P-1, then it might not be a big concern. But in P-1, a single error can potentially ruin the whole calculation, causing us to miss a factor, whereas in TF the impact of any error is localized. So TF should be more reliable than P-1, and the redundancy provides a little extra reliability to the overall process.

Another factor to consider is that higher TF causes P-1 to pick lower bounds. But if we skip the smooth factors during TF, we cannot consider that TF level as "done" for the purpose of picking P-1 bounds, and so we'll end up spending more time on P-1. This would reduce the net savings to the project.

It is kinda complicated to model all this and see if we'd have a net win. And we don't even know if the theoretical 30% savings can be realized.

2016-04-25, 04:54   #6
LaurV

Quote:
Originally Posted by potonono
All that aside, I agree something like that heavily depends on being able to identify the overlap efficiently, which more than likely wouldn't be easy...
And it would not be worth it, as already pointed out. P-1 and TF search for factors with different properties, and their overlapping segment is quite small. You would need to do P-1 before TF (and not the other way around), but right now, with the GPUs, the TF step that covers the same segment as P-1 is much faster, about as fast as a "sieving" task would be.

2016-04-25, 05:05   #7
LaurV

Quote:
Originally Posted by axn
If such a scheme were to be implemented, it could usefully skip factors for the final bit depth.
No, it could not (and sorry for the cross post): that would need factoring each k to see which ones are smooth, or, at best, sieving, which would take about the same amount of time as doing the full TF with the GPU.

Doing the final bit of TF after the P-1 makes sense (sometimes even the final two bits!), because a P-1 run with a 3-4% chance of finding a factor (one in 25-35 attempts yields a factor) runs about 3 times faster (on a 580 or Titan) than a TF to 75 bits or so (which has roughly a one-in-75 chance of finding a factor); therefore you can find factors faster running P-1 than running TF to the last bit. Of course, this calculus is coarse, and it does not much consider the fact that if a factor is not found, the TF to the last bit will be done anyway.
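To put rough numbers on that coarse calculus (this is one reading of the figures above: ~1/30 success per P-1 attempt, ~1/75 for the last TF bit, with P-1 about 3 times faster):

    expected factors per unit of GPU time, P-1:          (1/30) / (1/3) ≈ 0.10
    expected factors per unit of GPU time, last TF bit:  (1/75) / 1     ≈ 0.013

so, by this estimate, P-1 finds factors several times faster, before accounting for the caveat in the last sentence above.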


2016-04-25, 05:24   #8
axn

Quote:
Originally Posted by LaurV
no, it could not (and sorry for the cross post) - that will need factoring k, to see which one is smooth, or in the best way, sieving, which would take the same amount as doing full TF with the GPU
It has to be done by sieving -- factoring would be too costly. Could be done on the GPU itself. But I don't understand why it would take the same (or more) time as doing full TF. Unless someone has actually tried it out, and has some hard numbers, I prefer to think of it as an open question.

Rough outline of how it works. B1 and B2 parameters control the sieving.

You sieve with all primes and prime powers < B1. Instead of clearing a bitmap entry, you'll have a byte-map, and you'll add log(p) (scaled up to be an integer) to the byte-map entry. Finally, you'll add log(B2) to that (or rather, every entry in the byte-map is initialized to log(B2)) and check whether the total is > log(k). All of this can be done with approximate logs.
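A minimal sketch of that outline (my own Python rendering of the description above, unoptimized and untested at scale; SCALE, slog, and the pure-Python prime loop are mine):

[CODE]
import math

def smooth_sieve(lo, n, B1, B2):
    """Flag which k in [lo, lo+n) are (B1, B2)-smooth, using the
    byte-map-of-logs idea: accumulate log p for every prime power
    p^e < B1 dividing k, then allow one extra prime <= B2."""
    SCALE = 32                              # fixed-point scale for logs
    def slog(x):
        return int(SCALE * math.log2(x))

    acc = [0] * n                           # the "byte-map"
    is_prime = [True] * (B1 + 1)
    for p in range(2, B1 + 1):
        if not is_prime[p]:
            continue
        for m in range(p * p, B1 + 1, p):
            is_prime[m] = False
        pe = p
        while pe < B1:                      # prime powers p, p^2, ... < B1
            for i in range((-lo) % pe, n, pe):
                acc[i] += slog(p)           # one log p per dividing power
            pe *= p
    # k is (B1, B2)-smooth iff its unsieved cofactor is <= B2; with
    # B1*B1 > B2 that cofactor is 1 or a single prime, so approximate
    # logs suffice (borderline entries would need an exact recheck).
    bonus = slog(B2)
    return [acc[i] + bonus >= slog(lo + i) for i in range(n)]
[/CODE]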

2016-04-25, 09:39   #9
LaurV

MfaktX splits the k's into a number of residue classes (either 420 or 4620, depending on settings), from which, depending on the residue of p, most are eliminated, because the candidate factor q = 2kp+1 of m = 2^p-1, with p prime, would either be composite or would be 3 or 5 (mod 8). The remaining classes (96 or 960, respectively) are sieved with a number of small primes, from a few thousand up to a million primes or more, depending on settings; that is, the q's are sieved. This way, the exponentiation is done only for q's that have a chance of being prime. From this, we know how fast the sieve is on a GPU.

What is proposed here is to repeat the same kind of sieve for the k's themselves, in each class, with primes smaller than B1, and only do the exponentiation for what's left; that is, to avoid the exponentiation for smooth k's. Technically, the existing sieve computes, for each small prime s, the value x = -1/(2p) (mod s) and eliminates all k that are x (mod s); this way, all q = 2kp+1 that are 0 (mod s) are eliminated from the factor-candidate list, since they can't be factors. One additional test could check whether k itself is 0 (mod s); the values of s for which this happens (with s < B1) would have to be multiplied together in a separate variable, one for each k. At the end, it would be necessary to test the accumulated products to see if they equal k (i.e. whether k was B1-smooth).

I see two problems. The first is memory and speed: this is not cutting out bits, like normal sieving, but multiplying values as large as 2^50 (when we go to 75-bit factors and have a 25-bit exponent p). Well, we can find their logarithms and add them instead of multiplying, but it is still a pain in the ass, and requires some 50 times more memory. The second concerns powers of the small primes s: additional tests would be needed to check whether k is power-smooth, otherwise we will miss factors. (The fact that, for the P-1 algorithm to find a factor, that factor needs to be B1-powersmooth, and not merely smooth, is usually missed by bystanders, so skipping all smooth k's in the TF step could miss factors.)
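For concreteness, a minimal sketch of the classic sieve step described above (my own Python, not mfaktc/mfakto source): for each small prime s, eliminate every k whose q = 2kp+1 is divisible by s. LaurV's proposal would add a second, log-accumulating pass over k ≡ 0 (mod s).

[CODE]
def tf_sieve(p, lo, n, small_primes):
    """Return flags over k in [lo, lo+n): True = q = 2*k*p + 1 survives
    sieving by the given small primes (i.e. is still a candidate)."""
    alive = [True] * n
    for s in small_primes:
        if (2 * p) % s == 0:
            continue                        # s divides 2p: q == 1 (mod s)
        # solve 2*k*p + 1 == 0 (mod s)  =>  k == -1/(2p) (mod s)
        x = (-pow(2 * p % s, -1, s)) % s    # modular inverse, Python 3.8+
        for i in range((x - lo) % s, n, s):
            alive[i] = False                # s divides q: not a factor
    return alive

# LaurV's extension: a second pass over k == 0 (mod s) accumulating
# log s (for s < B1) per k, to detect and skip B1-smooth k's as well.
[/CODE]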
