mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Factoring

Reply
 
Thread Tools
Old 2020-11-22, 15:48   #1
storm5510
Random Account
 
storm5510's Avatar
 
Aug 2009
U.S.A.

110100110012 Posts
Default ECM Factors

Question: Where are the majority of factors found in ECM? Stage 1 or Stage 2?
storm5510 is offline   Reply With Quote
Old 2020-11-22, 18:20   #2
xilman
Bamboozled!
 
xilman's Avatar
 
"π’‰Ίπ’ŒŒπ’‡·π’†·π’€­"
May 2003
Down not across

1019710 Posts
Default

Quote:
Originally Posted by storm5510 View Post
Question: Where are the majority of factors found in ECM? Stage 1 or Stage 2?
It depends.


With my workload it is in stage 2.
xilman is offline   Reply With Quote
Old 2020-11-23, 15:31   #3
storm5510
Random Account
 
storm5510's Avatar
 
Aug 2009
U.S.A.

3·563 Posts
Default

Quote:
Originally Posted by xilman View Post
It depends.

With my workload it is in stage 2.
Several years ago, a member here told me,"Multiple factors can be found below B1, but only one factor above." I wanted to check the validity of this statement. Apparently, this is not always the case.
storm5510 is offline   Reply With Quote
Old 2020-11-23, 16:32   #4
axn
 
axn's Avatar
 
Jun 2003

112558 Posts
Default

Quote:
Originally Posted by storm5510 View Post
Several years ago, a member here told me,"Multiple factors can be found below B1, but only one factor above." I wanted to check the validity of this statement. Apparently, this is not always the case.
Something's lost in translation. Rather than explain why you've misunderstood that (oddly phrased) statement, can you explain to us why you think that is not always the case?
axn is online now   Reply With Quote
Old 2020-11-23, 16:44   #5
chris2be8
 
chris2be8's Avatar
 
Sep 2009

5×389 Posts
Default

Quote:
Originally Posted by storm5510 View Post
Several years ago, a member here told me,"Multiple factors can be found below B1, but only one factor above." I wanted to check the validity of this statement. Apparently, this is not always the case.
I think it might mean that if a number has several prime factors stage 1 might find a composite factor, the product of more that 1 prime factor. But stage 2 would not find a composte factor.

But that's not true. A quick search of recent ECM work I've done reveals:
Code:
********** Factor found in step 2: 450447448630040607679056578447
Found composite factor of 30 digits: 450447448630040607679056578447
Prime cofactor 857728978884123546562041187998599433152336542089743038903 has 57 digits
Chris
chris2be8 is offline   Reply With Quote
Old 2020-11-23, 18:00   #6
storm5510
Random Account
 
storm5510's Avatar
 
Aug 2009
U.S.A.

3·563 Posts
Default

Quote:
Originally Posted by axn View Post
Something's lost in translation. Rather than explain why you've misunderstood that (oddly phrased) statement, can you explain to us why you think that is not always the case?
Because of this:

Quote:
Originally Posted by xilman
With my workload, it is in stage 2.
.
"Multiple factors can be found below B1, but only one factor above..."

Oddly phrased? Perhaps the individual who wrote this statement thought I may not understand a more complex response, so he kept it simple. This was probably three years ago, at least. Who wrote it, I cannot remember.
storm5510 is offline   Reply With Quote
Old 2020-11-23, 18:39   #7
ATH
Einyen
 
ATH's Avatar
 
Dec 2003
Denmark

11·271 Posts
Default

Quote:
Originally Posted by storm5510 View Post
Several years ago, a member here told me,"Multiple factors can be found below B1, but only one factor above." I wanted to check the validity of this statement. Apparently, this is not always the case.
I think it is a misunderstanding that comes from P-1, where the method will find a factor p if p-1 is "smooth enough": Meaning that all of p-1 factors are below B1 with at most 1 factor allowed in between B1 and B2. But this requirement of p-1 has nothing to do with the factors p that the method finds.

Besides with ECM it is the group order that has to be smooth and the group order changes with each curves sigma.

Last fiddled with by ATH on 2020-11-23 at 18:43
ATH is offline   Reply With Quote
Old 2020-11-23, 19:18   #8
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

106068 Posts
Default

Quote:
Originally Posted by storm5510 View Post
Because of this:

.
"Multiple factors can be found below B1, but only one factor above..."

Oddly phrased? Perhaps the individual who wrote this statement thought I may not understand a more complex response, so he kept it simple. This was probably three years ago, at least. Who wrote it, I cannot remember.
What does finding more factors in stage 2 have to do with that "multiple factors" statement? Those seem completely unconnected, yet you used "because of this". So, I too fear there is a substantial set of misunderstandings here, and a quick correction of one bad assumption won't cure the real problem.
VBCurtis is offline   Reply With Quote
Old 2020-11-24, 08:47   #9
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
Jun 2011
Thailand

22·7·11·29 Posts
Default

Quote:
Originally Posted by storm5510 View Post
"Multiple factors can be found below B1, but only one factor above."
That info is somehow true, haha, if you think about. But only "somehow". Of course, all former posters are right, and there is something "lost in translation" there.

Taking as example P-1, because that is easy to explain, assume there is a prime factor q that divides m=2^p-1, then we know that q is prime (yeah...) and it has the form q=2kp+1. Because it is prime, from Fermat's theorem, for any base b smaller than q, we have b^(q-1)=1 (mod q). Which, when implemented for mersenne numbers, due to the fact that q=2kp+1, we have b^(2kp)-1=0 (mod q). Which means that (b^(2p))^k-1 is a multiple of q. If we can calculate this, or some multiple of it, then the gcd between m and the result of that calculation, will reveal the factor q.

What P-1 factoring does, it computes a very large number E, as the product of a lot of small numbers (i.e. LCM(all numbers below a "boundary" B1), which means all numbers below B1 are included in this E), then it picks a small random b (usually 3 or 5), it computes first c=b^(2p), then it computes H=c^E. If k is smooth (i.e. it has nothing but very small factors) then H-1 will be a multiple of q, and we will find the factor q.

Now, E contains lots and lots of prime factors, millions (depending how high B1 is). There is a possibility that more factors of m, i.e q1, q2, etc., "fit the pattern".

Example:
Assume you try to find a factor of m=M29=2^29-1=536870911=233*1103*2089, using B1=10, B2=23.

You calculate first E=2^3*3^2*5*7=72*35=2520 (product of all primes smaller than B1=10, with their largest power not exceeding B1=10). E is a very smooth number, and it is the least common multiple of all numbers from 2 to 10, i.e. all such numbers 2, 3, 4, 5, 6, 7, 8, 9, 10, are "inside E". Then, you pick b=3, and compute c=b^(2*29)=3^58 (mod m), then H=c^E (mod m). You do the calculation mod m=M29, to avoid the numbers getting ridiculously big. Without the modular step, c would be a (about) 30 digits number, and H would be a number with over 70 thousands digits! (only for this shitty small example).

So, (al calculation is (mod m)): c=3^58=194980136, H=194980136^2520=214651018. H'=H-1=214651018-1=214651017.

If now you take gcd(H',m)=gcd(214651017,536870911) you get 486737, which is the products of 233 and 2089.

So, you found the first and the third factor, but didn't find the second. Why?

Because the "k" of the first and the third factor is 10-smooth, while the k of the second is not.

We have (the "k" is in parenthesis, in q=2kp+1 form):

233 = 2^3*29+1 = 2*(2^2)*29+1 = 2*(4)*29+1
1103 = 2*(19)*29+1
2089 = 2^3*3^2*29+1 = 2*(2^2*3^2)*29+1 = 2*(36)*29+1

All factors of "k" in the first and third case are twos and threes, i.e. smaller than 10. The second "k" is 19, which is not 10-smooth. So, P-1 can't find this factor, with this B1.
(here you may think about why P-1 does not find the factors "in order", larger factors can be found, and smaller can be missed).

Therefore, in stage 1 of the P-1, we found a multiple (composite) factor, q1*q3.

Assuming we want to continue with the P-1 stage 2 (if we didn't find any factor in stage 1, or if we just want to continue) then stage 2 works like that: we keep the result H that we computed in stage 1, we need it.

Then, for any prime r between B1 and B2 (in our example, 13, 17, 19, 23) we calculate X=H^r and take the gcd(X-1,m). Therefore, in stage 2, you can find a factor q ONLY if its particular "k" has that particular "r" in its factors.

In our case, taking gcd(H^13-1,m), and gcd(H^17-1,m), will NOT reveal any new factors. Only when we move to gcd(H^19-1, m), we will find the factor 1103. (well, this is a stupid example, because in this case the GCD is m itself, so no factor is found, I should have picked a mersenne number with 4 factors for that, but I am lazy to go back and remake all the calculus, which I did in my head, except for the large exponentiation, for which I used pari).

For mersenne numbers, probability of two factors q1 and q2 of the same m, to have respective k1 and k2 which share the same r is very VERY small, especially for large r. You can count such cases on your fingers on one hand.

In practice, things are a bit different, the algorithm makes a lot of optimizations in stage 2, it doesn't take the primes one by one, but in "groups", and it pre-computes "things" that transform the exponentiation into multiplication, that is why it is very fast, and that is why we do stage 2, otherwise, taking the prime one by one and doing exponentiations, it would be much too slow, and it would make no sense to do it. Yet, to find a multiple factor in stage 2, you will need that the factors that you find in stage 2 share some "r" in the same gcd group.

Assuming q1=2k1p+1, q2=2k2p+1, where k1,2=z1,2*r1,2, where z1,2 are some integers, and r1,2 are some primes. You will need to have q1=2z1r1p+1, q2=2z2r2p+1 with all factors of z1 and z2 being smaller than B1, and r1 and r2 being two LARGE primes, larger than B1, not necessary equal, but they must be in the same "gcd group", i.e. very close to each-other, to be found at the next gcd. This is extremely restrictive.

So, yes, your probability to find more than one factors in stage 1 is much-MUCH higher than the probability to find more than one factor in stage 2.

But about finding factors, in general, multiple or not, more factors are found in stage 2, because probabilistic, if you pick a random number, there is a higher probability that such number has one big factor and many small factors, than it is to have only small factors. Smooth numbers are "rare" and they become rarer as you go up on the numbers axis. See for example why in RSA cryptography, when a key is "weak", or easy to factor (the used primes were not "safe primes" or so), there is a higher probability it was intentionally chosen so, than that is was random picked.

Additional, we work here with B2 being 30 or 50 or 100 times larger than B1. Think about that: picking a random number, in a reasonable range (we talk here about the number "q-1") what's its probability to have all factors under B1, and what's its probability to have all-but-one factors under B1 and one factor only between B1 and 30 or 50 or 100*B1? What's its probability to have all factors but few (2, 3, etc) under B1, and the rest (2, 3, etc) between B1 and B2, but in the same "group" to be gulped at once in the gcd?

Mind that we are talking about factors that we find in stage 2, and not "factors that fulfill the condition to be found in stage 2" (stage 2 will stop at the first gcd that finds a factor, even if there are more factors there, while the stage 1 will always find ALL "factors that fulfill the stage 1 condition to be found", i.e. have k's which are B1 smmoth).

ECM works somehow similar, but it would be more complicated to explain with an example.

You also have to consider the fact that stage 2 is usually done only if stage 1 doesn't return any factor.

Last fiddled with by LaurV on 2020-11-24 at 09:22
LaurV is offline   Reply With Quote
Old 2020-11-24, 15:56   #10
storm5510
Random Account
 
storm5510's Avatar
 
Aug 2009
U.S.A.

69916 Posts
Default

Quote:
Originally Posted by VBCurtis View Post
...So, I too fear there is a substantial set of misunderstandings here, and a quick correction of one bad assumption won't cure the real problem.
I must beg forgiveness! I have clearly stepped into an area of which my understanding is far less that what I had believed.

Quote:
Originally Posted by LaurV
That info is somehow true, haha, if you think about. But only "somehow". Of course, all former posters are right, and there is something "lost in translation" there.

Taking as example P-1, because that is easy to explain, assume there is a prime factor q that divides m=2^p-1, then we know that q is prime (yeah...) and it has the form q=2kp+1. Because it is prime, from Fermat's theorem, for any base b smaller than q, we have b^(q-1)=1 (mod q). Which, when implemented for mersenne numbers, due to the fact that q=2kp+1, we have b^(2kp)-1=0 (mod q). Which means that (b^(2p))^k-1 is a multiple of q. If we can calculate this, or some multiple of it, then the gcd between m and the result of that calculation, will reveal the factor q.

What P-1 factoring does, it computes a very large number E, as the product of a lot of small numbers (i.e. LCM(all numbers below a "boundary" B1), which means all numbers below B1 are included in this E), then it picks a small random b (usually 3 or 5), it computes first c=b^(2p), then it computes H=c^E. If k is smooth (i.e. it has nothing but very small factors) then H-1 will be a multiple of q, and we will find the factor q.

Now, E contains lots and lots of prime factors, millions (depending how high B1 is). There is a possibility that more factors of m, i.e q1, q2, etc., "fit the pattern".

Example:
Assume you try to find a factor of m=M29=2^29-1=536870911=233*1103*2089, using B1=10, B2=23.

You calculate first E=2^3*3^2*5*7=72*35=2520 (product of all primes smaller than B1=10, with their largest power not exceeding B1=10). E is a very smooth number, and it is the least common multiple of all numbers from 2 to 10, i.e. all such numbers 2, 3, 4, 5, 6, 7, 8, 9, 10, are "inside E". Then, you pick b=3, and compute c=b^(2*29)=3^58 (mod m), then H=c^E (mod m). You do the calculation mod m=M29, to avoid the numbers getting ridiculously big. Without the modular step, c would be a (about) 30 digits number, and H would be a number with over 70 thousands digits! (only for this shitty small example).

So, (al calculation is (mod m)): c=3^58=194980136, H=194980136^2520=214651018. H'=H-1=214651018-1=214651017.

If now you take gcd(H',m)=gcd(214651017,536870911) you get 486737, which is the products of 233 and 2089.

So, you found the first and the third factor, but didn't find the second. Why?

Because the "k" of the first and the third factor is 10-smooth, while the k of the second is not.

We have (the "k" is in parenthesis, in q=2kp+1 form):

233 = 2^3*29+1 = 2*(2^2)*29+1 = 2*(4)*29+1
1103 = 2*(19)*29+1
2089 = 2^3*3^2*29+1 = 2*(2^2*3^2)*29+1 = 2*(36)*29+1

All factors of "k" in the first and third case are twos and threes, i.e. smaller than 10. The second "k" is 19, which is not 10-smooth. So, P-1 can't find this factor, with this B1.
(here you may think about why P-1 does not find the factors "in order", larger factors can be found, and smaller can be missed).

Therefore, in stage 1 of the P-1, we found a multiple (composite) factor, q1*q3.

Assuming we want to continue with the P-1 stage 2 (if we didn't find any factor in stage 1, or if we just want to continue) then stage 2 works like that: we keep the result H that we computed in stage 1, we need it.

Then, for any prime r between B1 and B2 (in our example, 13, 17, 19, 23) we calculate X=H^r and take the gcd(X-1,m). Therefore, in stage 2, you can find a factor q ONLY if its particular "k" has that particular "r" in its factors.

In our case, taking gcd(H^13-1,m), and gcd(H^17-1,m), will NOT reveal any new factors. Only when we move to gcd(H^19-1, m), we will find the factor 1103. (well, this is a stupid example, because in this case the GCD is m itself, so no factor is found, I should have picked a mersenne number with 4 factors for that, but I am lazy to go back and remake all the calculus, which I did in my head, except for the large exponentiation, for which I used pari).

For mersenne numbers, probability of two factors q1 and q2 of the same m, to have respective k1 and k2 which share the same r is very VERY small, especially for large r. You can count such cases on your fingers on one hand.

In practice, things are a bit different, the algorithm makes a lot of optimizations in stage 2, it doesn't take the primes one by one, but in "groups", and it pre-computes "things" that transform the exponentiation into multiplication, that is why it is very fast, and that is why we do stage 2, otherwise, taking the prime one by one and doing exponentiations, it would be much too slow, and it would make no sense to do it. Yet, to find a multiple factor in stage 2, you will need that the factors that you find in stage 2 share some "r" in the same gcd group.

Assuming q1=2k1p+1, q2=2k2p+1, where k1,2=z1,2*r1,2, where z1,2 are some integers, and r1,2 are some primes. You will need to have q1=2z1r1p+1, q2=2z2r2p+1 with all factors of z1 and z2 being smaller than B1, and r1 and r2 being two LARGE primes, larger than B1, not necessary equal, but they must be in the same "gcd group", i.e. very close to each-other, to be found at the next gcd. This is extremely restrictive.

So, yes, your probability to find more than one factors in stage 1 is much-MUCH higher than the probability to find more than one factor in stage 2.

But about finding factors, in general, multiple or not, more factors are found in stage 2, because probabilistic, if you pick a random number, there is a higher probability that such number has one big factor and many small factors, than it is to have only small factors. Smooth numbers are "rare" and they become rarer as you go up on the numbers axis. See for example why in RSA cryptography, when a key is "weak", or easy to factor (the used primes were not "safe primes" or so), there is a higher probability it was intentionally chosen so, than that is was random picked.

Additional, we work here with B2 being 30 or 50 or 100 times larger than B1. Think about that: picking a random number, in a reasonable range (we talk here about the number "q-1") what's its probability to have all factors under B1, and what's its probability to have all-but-one factors under B1 and one factor only between B1 and 30 or 50 or 100*B1? What's its probability to have all factors but few (2, 3, etc) under B1, and the rest (2, 3, etc) between B1 and B2, but in the same "group" to be gulped at once in the gcd?

Mind that we are talking about factors that we find in stage 2, and not "factors that fulfill the condition to be found in stage 2" (stage 2 will stop at the first gcd that finds a factor, even if there are more factors there, while the stage 1 will always find ALL "factors that fulfill the stage 1 condition to be found", i.e. have k's which are B1 smmoth).

ECM works somehow similar, but it would be more complicated to explain with an example.

You also have to consider the fact that stage 2 is usually done only if stage 1 doesn't return any factor.
LaurV: Thank you for taking the time write the above. I read all of it regardless of whether I understood it or not. What I highlighted in bold and blue is all I really needed to know.
storm5510 is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
No factors below 2^76. What does this mean? Thanks king Information & Answers 2 2018-02-10 15:34
Known factors ATH PrimeNet 2 2014-09-04 19:45
Missing factors at the 'Known Factors' page MatWur-S530113 PrimeNet 11 2009-01-21 19:08
factors ATH Prime Cullen Prime 16 2007-07-07 13:02
I need some factors MatWur-S530113 Math 21 2007-05-12 19:36

All times are UTC. The time now is 02:42.

Sun Nov 29 02:42:28 UTC 2020 up 79 days, 23:53, 3 users, load averages: 1.33, 1.28, 1.25

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.