mersenneforum.org ECM Factors
 Register FAQ Search Today's Posts Mark Forums Read

 2020-11-22, 15:48 #1 storm5510 Random Account     Aug 2009 U.S.A. 28×7 Posts ECM Factors Question: Where are the majority of factors found in ECM? Stage 1 or Stage 2?
2020-11-22, 18:20   #2
xilman
Bamboozled!

"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

101001000001112 Posts

Quote:
 Originally Posted by storm5510 Question: Where are the majority of factors found in ECM? Stage 1 or Stage 2?
It depends.

With my workload it is in stage 2.

2020-11-23, 15:31   #3
storm5510
Random Account

Aug 2009
U.S.A.

70016 Posts

Quote:
 Originally Posted by xilman It depends. With my workload it is in stage 2.
Several years ago, a member here told me,"Multiple factors can be found below B1, but only one factor above." I wanted to check the validity of this statement. Apparently, this is not always the case.

2020-11-23, 16:32   #4
axn

Jun 2003

12EA16 Posts

Quote:
 Originally Posted by storm5510 Several years ago, a member here told me,"Multiple factors can be found below B1, but only one factor above." I wanted to check the validity of this statement. Apparently, this is not always the case.
Something's lost in translation. Rather than explain why you've misunderstood that (oddly phrased) statement, can you explain to us why you think that is not always the case?

2020-11-23, 16:44   #5
chris2be8

Sep 2009

22×32×5×11 Posts

Quote:
 Originally Posted by storm5510 Several years ago, a member here told me,"Multiple factors can be found below B1, but only one factor above." I wanted to check the validity of this statement. Apparently, this is not always the case.
I think it might mean that if a number has several prime factors stage 1 might find a composite factor, the product of more that 1 prime factor. But stage 2 would not find a composte factor.

But that's not true. A quick search of recent ECM work I've done reveals:
Code:
********** Factor found in step 2: 450447448630040607679056578447
Found composite factor of 30 digits: 450447448630040607679056578447
Prime cofactor 857728978884123546562041187998599433152336542089743038903 has 57 digits
Chris

2020-11-23, 18:00   #6
storm5510
Random Account

Aug 2009
U.S.A.

28·7 Posts

Quote:
 Originally Posted by axn Something's lost in translation. Rather than explain why you've misunderstood that (oddly phrased) statement, can you explain to us why you think that is not always the case?
Because of this:

Quote:
 Originally Posted by xilman With my workload, it is in stage 2.
.
"Multiple factors can be found below B1, but only one factor above..."

Oddly phrased? Perhaps the individual who wrote this statement thought I may not understand a more complex response, so he kept it simple. This was probably three years ago, at least. Who wrote it, I cannot remember.

2020-11-23, 18:39   #7
ATH
Einyen

Dec 2003
Denmark

300910 Posts

Quote:
 Originally Posted by storm5510 Several years ago, a member here told me,"Multiple factors can be found below B1, but only one factor above." I wanted to check the validity of this statement. Apparently, this is not always the case.
I think it is a misunderstanding that comes from P-1, where the method will find a factor p if p-1 is "smooth enough": Meaning that all of p-1 factors are below B1 with at most 1 factor allowed in between B1 and B2. But this requirement of p-1 has nothing to do with the factors p that the method finds.

Besides with ECM it is the group order that has to be smooth and the group order changes with each curves sigma.

Last fiddled with by ATH on 2020-11-23 at 18:43

2020-11-23, 19:18   #8
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

11×419 Posts

Quote:
 Originally Posted by storm5510 Because of this: . "Multiple factors can be found below B1, but only one factor above..." Oddly phrased? Perhaps the individual who wrote this statement thought I may not understand a more complex response, so he kept it simple. This was probably three years ago, at least. Who wrote it, I cannot remember.
What does finding more factors in stage 2 have to do with that "multiple factors" statement? Those seem completely unconnected, yet you used "because of this". So, I too fear there is a substantial set of misunderstandings here, and a quick correction of one bad assumption won't cure the real problem.

2020-11-24, 08:47   #9
LaurV
Romulan Interpreter

Jun 2011
Thailand

3×3,049 Posts

Quote:
 Originally Posted by storm5510 "Multiple factors can be found below B1, but only one factor above."
That info is somehow true, haha, if you think about. But only "somehow". Of course, all former posters are right, and there is something "lost in translation" there.

Taking as example P-1, because that is easy to explain, assume there is a prime factor q that divides m=2^p-1, then we know that q is prime (yeah...) and it has the form q=2kp+1. Because it is prime, from Fermat's theorem, for any base b smaller than q, we have b^(q-1)=1 (mod q). Which, when implemented for mersenne numbers, due to the fact that q=2kp+1, we have b^(2kp)-1=0 (mod q). Which means that (b^(2p))^k-1 is a multiple of q. If we can calculate this, or some multiple of it, then the gcd between m and the result of that calculation, will reveal the factor q.

What P-1 factoring does, it computes a very large number E, as the product of a lot of small numbers (i.e. LCM(all numbers below a "boundary" B1), which means all numbers below B1 are included in this E), then it picks a small random b (usually 3 or 5), it computes first c=b^(2p), then it computes H=c^E. If k is smooth (i.e. it has nothing but very small factors) then H-1 will be a multiple of q, and we will find the factor q.

Now, E contains lots and lots of prime factors, millions (depending how high B1 is). There is a possibility that more factors of m, i.e q1, q2, etc., "fit the pattern".

Example:
Assume you try to find a factor of m=M29=2^29-1=536870911=233*1103*2089, using B1=10, B2=23.

You calculate first E=2^3*3^2*5*7=72*35=2520 (product of all primes smaller than B1=10, with their largest power not exceeding B1=10). E is a very smooth number, and it is the least common multiple of all numbers from 2 to 10, i.e. all such numbers 2, 3, 4, 5, 6, 7, 8, 9, 10, are "inside E". Then, you pick b=3, and compute c=b^(2*29)=3^58 (mod m), then H=c^E (mod m). You do the calculation mod m=M29, to avoid the numbers getting ridiculously big. Without the modular step, c would be a (about) 30 digits number, and H would be a number with over 70 thousands digits! (only for this shitty small example).

So, (al calculation is (mod m)): c=3^58=194980136, H=194980136^2520=214651018. H'=H-1=214651018-1=214651017.

If now you take gcd(H',m)=gcd(214651017,536870911) you get 486737, which is the products of 233 and 2089.

So, you found the first and the third factor, but didn't find the second. Why?

Because the "k" of the first and the third factor is 10-smooth, while the k of the second is not.

We have (the "k" is in parenthesis, in q=2kp+1 form):

233 = 2^3*29+1 = 2*(2^2)*29+1 = 2*(4)*29+1
1103 = 2*(19)*29+1
2089 = 2^3*3^2*29+1 = 2*(2^2*3^2)*29+1 = 2*(36)*29+1

All factors of "k" in the first and third case are twos and threes, i.e. smaller than 10. The second "k" is 19, which is not 10-smooth. So, P-1 can't find this factor, with this B1.
(here you may think about why P-1 does not find the factors "in order", larger factors can be found, and smaller can be missed).

Therefore, in stage 1 of the P-1, we found a multiple (composite) factor, q1*q3.

Assuming we want to continue with the P-1 stage 2 (if we didn't find any factor in stage 1, or if we just want to continue) then stage 2 works like that: we keep the result H that we computed in stage 1, we need it.

Then, for any prime r between B1 and B2 (in our example, 13, 17, 19, 23) we calculate X=H^r and take the gcd(X-1,m). Therefore, in stage 2, you can find a factor q ONLY if its particular "k" has that particular "r" in its factors.

In our case, taking gcd(H^13-1,m), and gcd(H^17-1,m), will NOT reveal any new factors. Only when we move to gcd(H^19-1, m), we will find the factor 1103. (well, this is a stupid example, because in this case the GCD is m itself, so no factor is found, I should have picked a mersenne number with 4 factors for that, but I am lazy to go back and remake all the calculus, which I did in my head, except for the large exponentiation, for which I used pari).

For mersenne numbers, probability of two factors q1 and q2 of the same m, to have respective k1 and k2 which share the same r is very VERY small, especially for large r. You can count such cases on your fingers on one hand.

In practice, things are a bit different, the algorithm makes a lot of optimizations in stage 2, it doesn't take the primes one by one, but in "groups", and it pre-computes "things" that transform the exponentiation into multiplication, that is why it is very fast, and that is why we do stage 2, otherwise, taking the prime one by one and doing exponentiations, it would be much too slow, and it would make no sense to do it. Yet, to find a multiple factor in stage 2, you will need that the factors that you find in stage 2 share some "r" in the same gcd group.

Assuming q1=2k1p+1, q2=2k2p+1, where k1,2=z1,2*r1,2, where z1,2 are some integers, and r1,2 are some primes. You will need to have q1=2z1r1p+1, q2=2z2r2p+1 with all factors of z1 and z2 being smaller than B1, and r1 and r2 being two LARGE primes, larger than B1, not necessary equal, but they must be in the same "gcd group", i.e. very close to each-other, to be found at the next gcd. This is extremely restrictive.

So, yes, your probability to find more than one factors in stage 1 is much-MUCH higher than the probability to find more than one factor in stage 2.

But about finding factors, in general, multiple or not, more factors are found in stage 2, because probabilistic, if you pick a random number, there is a higher probability that such number has one big factor and many small factors, than it is to have only small factors. Smooth numbers are "rare" and they become rarer as you go up on the numbers axis. See for example why in RSA cryptography, when a key is "weak", or easy to factor (the used primes were not "safe primes" or so), there is a higher probability it was intentionally chosen so, than that is was random picked.

Additional, we work here with B2 being 30 or 50 or 100 times larger than B1. Think about that: picking a random number, in a reasonable range (we talk here about the number "q-1") what's its probability to have all factors under B1, and what's its probability to have all-but-one factors under B1 and one factor only between B1 and 30 or 50 or 100*B1? What's its probability to have all factors but few (2, 3, etc) under B1, and the rest (2, 3, etc) between B1 and B2, but in the same "group" to be gulped at once in the gcd?

Mind that we are talking about factors that we find in stage 2, and not "factors that fulfill the condition to be found in stage 2" (stage 2 will stop at the first gcd that finds a factor, even if there are more factors there, while the stage 1 will always find ALL "factors that fulfill the stage 1 condition to be found", i.e. have k's which are B1 smmoth).

ECM works somehow similar, but it would be more complicated to explain with an example.

You also have to consider the fact that stage 2 is usually done only if stage 1 doesn't return any factor.

Last fiddled with by LaurV on 2020-11-24 at 09:22

2020-11-24, 15:56   #10
storm5510
Random Account

Aug 2009
U.S.A.

28×7 Posts

Quote:
 Originally Posted by VBCurtis ...So, I too fear there is a substantial set of misunderstandings here, and a quick correction of one bad assumption won't cure the real problem.
I must beg forgiveness! I have clearly stepped into an area of which my understanding is far less that what I had believed.

Quote:
LaurV: Thank you for taking the time write the above. I read all of it regardless of whether I understood it or not. What I highlighted in bold and blue is all I really needed to know.

 Similar Threads Thread Thread Starter Forum Replies Last Post king Information & Answers 2 2018-02-10 15:34 ATH PrimeNet 2 2014-09-04 19:45 MatWur-S530113 PrimeNet 11 2009-01-21 19:08 ATH Prime Cullen Prime 16 2007-07-07 13:02 MatWur-S530113 Math 21 2007-05-12 19:36

All times are UTC. The time now is 00:12.

Wed Jan 20 00:12:53 UTC 2021 up 47 days, 20:24, 0 users, load averages: 2.99, 2.73, 2.71