mersenneforum.org  

Go Back   mersenneforum.org > New To GIMPS? Start Here! > Information & Answers

Reply
 
Thread Tools
Old 2015-07-29, 19:28   #1
houding
 
houding's Avatar
 
"Adolf"
Nov 2013
South Africa

6110 Posts
Default ECM questions

I've been wondering about a few things about ECM work, and was hoping someone can help me out.

1. The difference between different runs on the same exponent at the same B1/B2 levels is the sigma value? If so, this is a random generated number?
2. If nr 1 is yes, how big does the random number go to?

Why I ask - there are a few people (myself included) that does work on F12. If I run for example 3 curves now, and then another 3 curves, I assume that the same sigma is not used again. Is there a chance the someone else might use the same sigma?

The current F12 is B1=800M, 360k curves. If there is a repeat of sigma values, then 360k different runs was not really done if some sigmas was used by same person or other people.
houding is offline   Reply With Quote
Old 2015-07-29, 20:33   #2
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

29×149 Posts
Default

A quick check of a location where sigmas are reported in addition to factors indicates sigmas range up to 4 billion or so (in a sample of ~10 factors). It is indeed chosen randomly*. So, over the course of 360k curves, there is a chance of a repeated curve or two.
VBCurtis is offline   Reply With Quote
Old 2015-07-30, 02:37   #3
axn
 
axn's Avatar
 
Jun 2003

7×11×61 Posts
Default

P95 uses 64-bit sigma. GMP-ECM uses 32-bit sigma.
/AFAIK
axn is offline   Reply With Quote
Old 2015-07-30, 04:17   #4
Prime95
P90 years forever!
 
Prime95's Avatar
 
Aug 2002
Yeehaw, FL

22·1,789 Posts
Default

P95 uses a 53-bit sigma
Prime95 is offline   Reply With Quote
Old 2015-07-30, 17:55   #5
Antonio
 
Antonio's Avatar
 
"Antonio Key"
Sep 2011
UK

32·59 Posts
Default

The question at this point then is, how 'good' is the random number generator and it's initialization function.

I have often wondered why distributed ECM work doesn't include the sigma(s) to be used to avoid duplication of work.
For example if I give GMP ECM a sigma value and the number of curves it appears to use consecutive sigma values starting from the given value. This would appear to be ideal for distributed work.
Antonio is offline   Reply With Quote
Old 2015-07-30, 18:05   #6
R.D. Silverman
 
R.D. Silverman's Avatar
 
Nov 2003

22×5×373 Posts
Default

Quote:
Originally Posted by Antonio View Post
The question at this point then is, how 'good' is the random number generator and it's initialization function.

I have often wondered why distributed ECM work doesn't include the sigma(s) to be used to avoid duplication of work.
Hint: Birthday Paradox. It simply isn't worth the effort.


Quote:
For example if I give GMP ECM a sigma value and the number of curves it appears to use consecutive sigma values starting from the given value. This would appear to be ideal for distributed work.
No. The simplest solution is merely to increase the dynamic range for sigma.
R.D. Silverman is offline   Reply With Quote
Old 2015-07-30, 22:20   #7
Antonio
 
Antonio's Avatar
 
"Antonio Key"
Sep 2011
UK

32×59 Posts
Default

Quote:
Originally Posted by R.D. Silverman View Post
Hint: Birthday Paradox. It simply isn't worth the effort.




No. The simplest solution is merely to increase the dynamic range for sigma.
The question is still, how good is the random number generator and more especially how good is the initialization (seeding) function.

Consider the following scenario:
1. The pseudo-random number generator produces each possible value of sigma once and once only before repeating it's cycle.
2. The seeding function produces 1 of 365 possible seeds with equal probability.

Now if only one person performs a single job of 1,000,000 curves we do not have a problem.

If the job is divided up into units of 10 curves each - how many curves will be repeated?

Now increase the dynamic range of the pseudo-random number generator (sigma). No other changes are made.
Will this make any difference to the number of curves repeated?

My contention is that if there is insufficient entropy in the seeding function, increasing the dynamic range of sigma will not have the effect you desired.
Antonio is offline   Reply With Quote
Old 2015-07-31, 03:05   #8
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

11100001101012 Posts
Default

Quote:
Originally Posted by Antonio View Post
My contention is that if there is insufficient entropy in the seeding function, increasing the dynamic range of sigma will not have the effect you desired.
I find that rather unlikely. What supporting reasoning makes you think this is plausible?
Dubslow is offline   Reply With Quote
Old 2015-07-31, 07:39   #9
Antonio
 
Antonio's Avatar
 
"Antonio Key"
Sep 2011
UK

32×59 Posts
Default

Quote:
Originally Posted by Dubslow View Post
I find that rather unlikely. What supporting reasoning makes you think this is plausible?
Think about the scenario above.
Reduce the entropy to zero, every run will start at the same point in the pseudo-random sequence and all runs of 10 curves will be identical no matter how big the sequence generator. With only 365 possible starting values it will generate only 1 of 365 possible sequences, again no matter what the length of the sequence is.

example:

sequence generator is (I+7) mod 13

sequence will be : 0, 7, 1, 8, 2, 9, 3, 10, 4, 11, 5, 12, 6, 0, 7.......

seeding function produces possible starting values of 1 or 4 so the possible sets of sigmas is

1, 8, 2, 3 , 10....
or
4, 11, 5, 12, 6.....

now increase the range of sigmas using the sequence generator (I+13) mod 23

sequence will be : 0, 13, 3, 16, 6, 19, 9, 22, 12, 2, 15, 5, 18, 8, 21, 11, 1, 14, 4, 17, 7, 20, 10, 0......

The entropy of the seeding function remains unchanged so it still only produces two possible starting values, say 1 and 16, sets of sigmas is

1, 14, 4, 17, 20.....
or
16, 6, 19, 9, 22.....

As you can see increasing the dynamic range of sigma, without increasing the entropy of the seeding function has no effect.
Once one set of curves has been run in either of the above scenario there is a 50% chance that the next set of curves will be a repeat of the first, no matter what the dynamic range of sigma.

What I am saying is that, without knowing how good the seeding function is there is no point in just increasing the dynamic range of sigma, at some point we will be limited by the entropy of the seeding function. I don't know where that point is, all I do know is that as the number of participants in a distributed effort increases the closer to that limit we will be, no matter what the dynamic range of sigma, and as we approach that point it would be better to include the sigma value(s) with the job.
Antonio is offline   Reply With Quote
Old 2015-07-31, 08:18   #10
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts
Default

Yes I understood what you were saying. Perhaps I should have been more specific with my quote.

Quote:
Originally Posted by Antonio View Post
My contention is that if there is insufficient entropy in the seeding function
I find that rather unlikely. What supporting reasoning makes you think this is plausible?

No seeding in use that I can plausibly imagine as being published in GMP-ECM or Prime95 would ever have only 365 unique seeds, or even some orders of magnitude greater than that.
Dubslow is offline   Reply With Quote
Old 2015-07-31, 09:01   #11
Antonio
 
Antonio's Avatar
 
"Antonio Key"
Sep 2011
UK

10238 Posts
Default

Quote:
Originally Posted by Dubslow View Post
Yes I understood what you were saying. Perhaps I should have been more specific with my quote.



I find that rather unlikely. What supporting reasoning makes you think this is plausible?

No seeding in use that I can plausibly imagine as being published in GMP-ECM or Prime95 would ever have only 365 unique seeds, or even some orders of magnitude greater than that.
OK, I agree with that. I was just taking my example to extremes.
Unless we know the number and distribution of the seeds, how the seeding function has been evaluated, where it gets it's entropy from etc. , there is no point in just increasing the dynamic range of sigma indefinitely. The limiting factor will always be the source of entropy used by the seeding function.
Does anyone know how 'good' this source is?
Was it designed to accommodate, for example, 1 million people each doing 1 random curve each?
I agree that it is probably 'good enough' for most scenarios, we shouldn't just assume, however, that it will always be good enough for every application.

Last fiddled with by Antonio on 2015-07-31 at 09:03
Antonio is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Two questions: Dubslow GPU Computing 1 2011-08-05 18:22
Questions about the QS Carmichael Factoring 8 2007-04-10 11:30
Questions OmbooHankvald Prime Sierpinski Project 2 2005-08-01 20:18
LLR questions OmbooHankvald Math 6 2005-06-23 11:42
A few questions :) xtreme2k Lounge 59 2002-10-31 06:20

All times are UTC. The time now is 09:19.

Fri Sep 25 09:19:52 UTC 2020 up 15 days, 6:30, 0 users, load averages: 1.46, 1.57, 1.67

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.