mersenneforum.org ECM probabilities and bit levels

2019-10-30, 18:27   #1
mnd9

Jun 2019
Boston, MA

100111₂ Posts

ECM probabilities and bit levels

Hi all,

I'm getting into doing more ECM work and I'm wondering: if the expected number of curves indicated for Prime95 (e.g. 280 curves at B1=50k, 640 curves at B1=250k) is run and no factor is found, what is the probability that we can exclude a factor in that digit range? In other words, is performing the expected number of curves the equivalent of a "no factor found" TF result for a specific bit range?

Also, does anybody have a rough breakdown of what bit level each B1 level corresponds to? I understand from this page, https://www.alpertron.com.ar/ECM.HTM, that for example B1 = 50,000 corresponds to 25 digits and B1 = 250,000 to 30 digits, so is it just as simple as multiplying these values by ln(10)/ln(2), or ~3.3?

Thanks!

2019-10-30, 18:53   #2
lycorn

"GIMFS"
Sep 2002
Oeiras, Portugal

23·197 Posts

If you do the prescribed number of curves for a certain B1 value (say 640 for B1 = 250,000), the chance that you miss a factor smaller than the corresponding number of digits (should it exist in the first place...) is roughly 1/e, or about 37%. For B1 = 250,000 that number is 30 digits. So no, getting a NF-ECM result is not the same as a TF "no factor found", as ECM is probabilistic whereas TF is deterministic.

As for your second question: yes, it's as simple as you stated.

PS - Welcome to the ECM tribe...
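
The two rules of thumb in this reply can be checked with a few lines of Python. This is only a sketch under a simplifying assumption: each curve is treated as an independent trial that finds an existing factor with probability 1/(expected curves). That is not necessarily how Prime95 derives its curve counts, and the function names are made up for illustration.

```python
import math


def miss_probability(curves_run: int, expected_curves: int) -> float:
    """Chance that an existing factor of the target size survives every curve.

    Assumption: each curve is an independent trial that finds the factor
    with probability 1 / expected_curves.
    """
    p_single = 1.0 / expected_curves
    return (1.0 - p_single) ** curves_run


def digits_to_bits(digits: float) -> float:
    """Convert a factor size in decimal digits to bits (factor of about 3.32)."""
    return digits * math.log(10) / math.log(2)


if __name__ == "__main__":
    print(f"miss chance after 640 of 640 curves: {miss_probability(640, 640):.4f}")
    print(f"1/e for comparison:                  {1 / math.e:.4f}")
    for digits in (25, 30):
        print(f"{digits} digits is about {digits_to_bits(digits):.1f} bits")
```

Under this model the miss chance after the full curve count sits very close to 1/e, which is why a completed ECM level lowers the odds of a factor in that range but never rules one out the way a finished TF bit level does.
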
2019-10-30, 19:00   #3
chalsall
If I May

"Chris Halsall"
Sep 2002

5×2,237 Posts

Quote:
 Originally Posted by lycorn PS - Welcome to the ECM tribe...
Just out of interest... How big are the ECM checkpoint files for the current (reasonable) "wavefront"?

I'm looking at offering Colab / Kaggle TF'ers the option of also doing CPU work in parallel -- some might be interested in doing ECM.

P.S. Just recovering from an unscheduled power failure... Grrr... Now I have to reestablish several dozen SSH sessions into servers...

2019-10-30, 21:11   #4
Dylan14

"Dylan"
Mar 2017

255₁₆ Posts

Quote:
 Originally Posted by chalsall Just out of interest... How big are the ECM checkpoint files for the current (reasonable) "wavefront"?

Not at the current wavefront, but for the two ecm runs that I have (one for M55057 and one for M7777727) the files are 13868 and 1944532 bytes, respectively. There are two backups for each of them. I am not sure if the lengths depend on the B1 or B2 bounds, however.

2019-10-30, 21:37   #5
Happy5214

"Alexander"
Nov 2008
The Alamo City

312 Posts

Quote:
 Originally Posted by Dylan14 Not at the current wavefront, but for the two ecm runs that I have (one for M55057 and one for M7777727) the files are 13868 and 1944532 bytes, respectively. There are two backups for each of them. I am not sure if the lengths depend on the B1 or B2 bounds, however.
For both of those cases, the exponent-to-size ratio is almost exactly 4:1. To be exact, they're 3.97:1 and 3.99979:1, respectively.

Using one of my examples, I have a 51540-byte checkpoint for M205759 (B1=250k), for a ratio of 3.9922:1.

Last fiddled with by Happy5214 on 2019-10-30 at 21:40
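
The ratios quoted above are easy to reproduce, and they suggest a rough rule of thumb of about exponent/4 bytes per checkpoint. The sketch below just redoes that arithmetic on the three reported file sizes. The estimator is an empirical fit to those three data points only, not a documented Prime95 file format, and the names are illustrative.

```python
# (exponent, checkpoint size in bytes) as reported in the posts above.
OBSERVED = [
    (55057, 13868),
    (7777727, 1944532),
    (205759, 51540),
]


def exponent_to_size_ratio(exponent: int, size_bytes: int) -> float:
    """Ratio of the Mersenne exponent to the checkpoint size in bytes."""
    return exponent / size_bytes


def estimate_checkpoint_bytes(exponent: int) -> int:
    """Rule-of-thumb size estimate: exponent divided by 4 (empirical, unverified)."""
    return exponent // 4


if __name__ == "__main__":
    for exponent, size in OBSERVED:
        ratio = exponent_to_size_ratio(exponent, size)
        guess = estimate_checkpoint_bytes(exponent)
        print(f"M{exponent}: ratio {ratio:.5f}:1, observed {size} B, estimate {guess} B")
```

For these three files the estimate lands within about 1% of the observed size, with the smallest exponent showing the largest relative gap.
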

2019-10-30, 22:55   #6
chalsall
If I May

"Chris Halsall"
Sep 2002

5·2,237 Posts

Quote:
 Originally Posted by Dylan14 Not at the current wavefront, but for the two ecm runs that I have (one for M55057 and one for M7777727) the files are 13868 and 1944532 bytes, respectively.
Hmmm... So relatively small.

You could put that over HTTP without any problem.

2019-10-30, 23:02   #7
lycorn

"GIMFS"
Sep 2002
Oeiras, Portugal

23×197 Posts

The size of the save files is related to the size of the exponent under test (which in turn determines the size of the FFT used). I am currently running curves on exponents in the 29xxxx range (15k FFT) and the save files are 72 KB in size. Two save files (main and backup) are used per test, although you may choose to have two backups.

If you let the server decide, it will currently assign exponents in the 13 08x xxx range (672 K FFT) and the save files will grow to 3.1 MB each.

2019-10-30, 23:24   #8
chalsall
If I May

"Chris Halsall"
Sep 2002

5×2,237 Posts

Quote:
 Originally Posted by lycorn I am currently running curves on exponents in the 29xxxx range (15k FFT) and the save files are 72 KB in size. Two save files (main and backup) are used per test, although you may choose to have two backups.
OK. And am I inferring correctly from your statement that this is "non-nominal" work?

With regard to instance design, each save file would get "thrown back" to the server when it was first "noticed". It would then be up to the server to decide how many would be kept, and for how long.

Is there an option to set the frequency of the save files?

Quote:
 Originally Posted by lycorn If you let the server decide, it will currently assign exponents in the 13 08x xxx range (672 K FFT) and the save files will grow to 3.1 MB each.
Again I'm inferring you mean asking Primenet for whatever it thinks is best?

Even ~3 MB isn't that big a deal -- we have already established that Colab / Kaggle instances are in no significant way bandwidth constrained.

2019-10-31, 00:28   #9
Dylan14

"Dylan"
Mar 2017

255₁₆ Posts

Quote:
 Originally Posted by chalsall Is there an option to set the frequency of the save files?

Yes. The option to change is

Code:
DiskWriteTime=30
in prime.txt. The units of time here are minutes. The program will also save when stopped gracefully by the user (via Ctrl+C, for instance).
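
For the instance design discussed earlier in the thread, the save interval and file size together set how much data would go back to the server. Here is a rough sketch of that arithmetic, assuming one save file is uploaded per DiskWriteTime interval and taking "3.1 MB" as 3,100,000 bytes. The function name and the upload-every-save behaviour are assumptions for illustration, not anything Prime95 or the Colab setup is known to do.

```python
def upload_bytes_per_hour(save_file_bytes: int, disk_write_minutes: float) -> float:
    """Bytes sent per hour if every periodic save file is uploaded once.

    Assumption: exactly one save file is written and uploaded per
    DiskWriteTime interval; backups and graceful-stop saves are ignored.
    """
    if disk_write_minutes <= 0:
        raise ValueError("disk_write_minutes must be positive")
    saves_per_hour = 60.0 / disk_write_minutes
    return save_file_bytes * saves_per_hour


if __name__ == "__main__":
    # 3.1 MB save file (taken as 3,100,000 bytes) with DiskWriteTime=30.
    per_hour = upload_bytes_per_hour(3_100_000, 30)
    print(f"about {per_hour / 1e6:.1f} MB per hour per worker")
```
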

2019-10-31, 00:40   #10
mnd9

Jun 2019
Boston, MA

47₈ Posts

Quote:
 Originally Posted by lycorn If you do the prescribed number of curves for a certain B1 value (say 640 for B1 = 250,000), the chance that you miss a factor smaller than the corresponding number of digits (should it exist in the first place...) is roughly 1/e, or about 37%. For B1 = 250,000 that number is 30 digits. So no, getting a NF-ECM result is not the same as a TF "no factor found", as ECM is probabilistic whereas TF is deterministic. As for your second question: yes, it's as simple as you stated. PS - Welcome to the ECM tribe...
Just another quick follow-up probability question: say an exponent has had the expected number of B1=50k and B1=250k curves done. Is the overall probability of missing a factor smaller than 30 digits still 1/e, or a bit lower? In other words, do the 50k curves contribute some additional effort independent of the 250k curves, or does the 1/e estimate assume that the appropriate curves have already been run at lower B1 values as well?
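
One way to frame this question is to treat every curve as an independent trial, so that effort from different B1 levels simply adds up in the exponent. The sketch below does that. Note the big caveat: the number of B1=50k curves that would be "expected" for a 30-digit factor is not given anywhere in this thread, so the value used here is a made-up placeholder purely to show the shape of the calculation.

```python
import math

# PLACEHOLDER, not a real figure: how many B1=50k curves would count as one
# "expected" dose against a 30-digit factor. Replace with a sourced value.
HYPOTHETICAL_50K_CURVES_FOR_30_DIGITS = 3000


def combined_miss_probability(batches) -> float:
    """Miss chance after several batches of curves, assuming independence.

    `batches` is an iterable of (curves_run, expected_curves_for_target)
    pairs, where running `expected_curves_for_target` curves alone would
    leave a miss chance of about 1/e for the target factor size.
    """
    effort = sum(run / expected for run, expected in batches)
    return math.exp(-effort)


if __name__ == "__main__":
    only_250k = combined_miss_probability([(640, 640)])
    both = combined_miss_probability(
        [(640, 640), (280, HYPOTHETICAL_50K_CURVES_FOR_30_DIGITS)]
    )
    print(f"250k curves only:    {only_250k:.3f}")
    print(f"plus the 50k curves: {both:.3f}")
```

Under this model the earlier 50k curves can only push the miss chance below 1/e, and by how much depends entirely on how effective a 50k curve is against a 30-digit factor.
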

2019-10-31, 00:41   #11
chalsall
If I May

"Chris Halsall"
Sep 2002

5·2,237 Posts

Quote:
 Originally Posted by Dylan14 ...by the user (via Ctrl+C, for instance).
Or by the instance, for instance...

Thanks for the information.
