mersenneforum.org ECM questions
 Register FAQ Search Today's Posts Mark Forums Read

 2015-07-29, 19:28 #1 houding     "Adolf" Nov 2013 South Africa 3D16 Posts ECM questions I've been wondering about a few things about ECM work, and was hoping someone can help me out. 1. The difference between different runs on the same exponent at the same B1/B2 levels is the sigma value? If so, this is a random generated number? 2. If nr 1 is yes, how big does the random number go to? Why I ask - there are a few people (myself included) that does work on F12. If I run for example 3 curves now, and then another 3 curves, I assume that the same sigma is not used again. Is there a chance the someone else might use the same sigma? The current F12 is B1=800M, 360k curves. If there is a repeat of sigma values, then 360k different runs was not really done if some sigmas was used by same person or other people.
 2015-07-29, 20:33 #2 VBCurtis     "Curtis" Feb 2005 Riverside, CA 7×617 Posts A quick check of a location where sigmas are reported in addition to factors indicates sigmas range up to 4 billion or so (in a sample of ~10 factors). It is indeed chosen randomly*. So, over the course of 360k curves, there is a chance of a repeated curve or two.
 2015-07-30, 02:37 #3 axn     Jun 2003 3×5×313 Posts P95 uses 64-bit sigma. GMP-ECM uses 32-bit sigma. /AFAIK
 2015-07-30, 04:17 #4 Prime95 P90 years forever!     Aug 2002 Yeehaw, FL 715510 Posts P95 uses a 53-bit sigma
 2015-07-30, 17:55 #5 Antonio     "Antonio Key" Sep 2011 UK 32×59 Posts The question at this point then is, how 'good' is the random number generator and it's initialization function. I have often wondered why distributed ECM work doesn't include the sigma(s) to be used to avoid duplication of work. For example if I give GMP ECM a sigma value and the number of curves it appears to use consecutive sigma values starting from the given value. This would appear to be ideal for distributed work.
2015-07-30, 18:05   #6
R.D. Silverman

Nov 2003

22·5·373 Posts

Quote:
 Originally Posted by Antonio The question at this point then is, how 'good' is the random number generator and it's initialization function. I have often wondered why distributed ECM work doesn't include the sigma(s) to be used to avoid duplication of work.
Hint: Birthday Paradox. It simply isn't worth the effort.

Quote:
 For example if I give GMP ECM a sigma value and the number of curves it appears to use consecutive sigma values starting from the given value. This would appear to be ideal for distributed work.
No. The simplest solution is merely to increase the dynamic range for sigma.

2015-07-30, 22:20   #7
Antonio

"Antonio Key"
Sep 2011
UK

21316 Posts

Quote:
 Originally Posted by R.D. Silverman Hint: Birthday Paradox. It simply isn't worth the effort. No. The simplest solution is merely to increase the dynamic range for sigma.
The question is still, how good is the random number generator and more especially how good is the initialization (seeding) function.

Consider the following scenario:
1. The pseudo-random number generator produces each possible value of sigma once and once only before repeating it's cycle.
2. The seeding function produces 1 of 365 possible seeds with equal probability.

Now if only one person performs a single job of 1,000,000 curves we do not have a problem.

If the job is divided up into units of 10 curves each - how many curves will be repeated?

Now increase the dynamic range of the pseudo-random number generator (sigma). No other changes are made.
Will this make any difference to the number of curves repeated?

My contention is that if there is insufficient entropy in the seeding function, increasing the dynamic range of sigma will not have the effect you desired.

2015-07-31, 03:05   #8
Dubslow

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

722110 Posts

Quote:
 Originally Posted by Antonio My contention is that if there is insufficient entropy in the seeding function, increasing the dynamic range of sigma will not have the effect you desired.
I find that rather unlikely. What supporting reasoning makes you think this is plausible?

2015-07-31, 07:39   #9
Antonio

"Antonio Key"
Sep 2011
UK

32·59 Posts

Quote:
 Originally Posted by Dubslow I find that rather unlikely. What supporting reasoning makes you think this is plausible?
Reduce the entropy to zero, every run will start at the same point in the pseudo-random sequence and all runs of 10 curves will be identical no matter how big the sequence generator. With only 365 possible starting values it will generate only 1 of 365 possible sequences, again no matter what the length of the sequence is.

example:

sequence generator is (I+7) mod 13

sequence will be : 0, 7, 1, 8, 2, 9, 3, 10, 4, 11, 5, 12, 6, 0, 7.......

seeding function produces possible starting values of 1 or 4 so the possible sets of sigmas is

1, 8, 2, 3 , 10....
or
4, 11, 5, 12, 6.....

now increase the range of sigmas using the sequence generator (I+13) mod 23

sequence will be : 0, 13, 3, 16, 6, 19, 9, 22, 12, 2, 15, 5, 18, 8, 21, 11, 1, 14, 4, 17, 7, 20, 10, 0......

The entropy of the seeding function remains unchanged so it still only produces two possible starting values, say 1 and 16, sets of sigmas is

1, 14, 4, 17, 20.....
or
16, 6, 19, 9, 22.....

As you can see increasing the dynamic range of sigma, without increasing the entropy of the seeding function has no effect.
Once one set of curves has been run in either of the above scenario there is a 50% chance that the next set of curves will be a repeat of the first, no matter what the dynamic range of sigma.

What I am saying is that, without knowing how good the seeding function is there is no point in just increasing the dynamic range of sigma, at some point we will be limited by the entropy of the seeding function. I don't know where that point is, all I do know is that as the number of participants in a distributed effort increases the closer to that limit we will be, no matter what the dynamic range of sigma, and as we approach that point it would be better to include the sigma value(s) with the job.

2015-07-31, 08:18   #10
Dubslow

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Yes I understood what you were saying. Perhaps I should have been more specific with my quote.

Quote:
 Originally Posted by Antonio My contention is that if there is insufficient entropy in the seeding function
I find that rather unlikely. What supporting reasoning makes you think this is plausible?

No seeding in use that I can plausibly imagine as being published in GMP-ECM or Prime95 would ever have only 365 unique seeds, or even some orders of magnitude greater than that.

2015-07-31, 09:01   #11
Antonio

"Antonio Key"
Sep 2011
UK

32·59 Posts

Quote:
 Originally Posted by Dubslow Yes I understood what you were saying. Perhaps I should have been more specific with my quote. I find that rather unlikely. What supporting reasoning makes you think this is plausible? No seeding in use that I can plausibly imagine as being published in GMP-ECM or Prime95 would ever have only 365 unique seeds, or even some orders of magnitude greater than that.
OK, I agree with that. I was just taking my example to extremes.
Unless we know the number and distribution of the seeds, how the seeding function has been evaluated, where it gets it's entropy from etc. , there is no point in just increasing the dynamic range of sigma indefinitely. The limiting factor will always be the source of entropy used by the seeding function.
Does anyone know how 'good' this source is?
Was it designed to accommodate, for example, 1 million people each doing 1 random curve each?
I agree that it is probably 'good enough' for most scenarios, we shouldn't just assume, however, that it will always be good enough for every application.

Last fiddled with by Antonio on 2015-07-31 at 09:03

 Similar Threads Thread Thread Starter Forum Replies Last Post Dubslow GPU Computing 1 2011-08-05 18:22 Carmichael Factoring 8 2007-04-10 11:30 OmbooHankvald Prime Sierpinski Project 2 2005-08-01 20:18 OmbooHankvald Math 6 2005-06-23 11:42 xtreme2k Lounge 59 2002-10-31 06:20

All times are UTC. The time now is 08:50.

Thu Sep 24 08:50:34 UTC 2020 up 14 days, 6:01, 0 users, load averages: 1.23, 1.72, 1.57