mersenneforum.org > Data mersenne.ca

2021-03-14, 01:11   #562
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006

3²×5×103 Posts

Quote:
 Originally Posted by Viliam Furik If I am interpreting the numbers correctly, that means I am about right. Is that correct? Maybe the best way to test would be to parse the list of known factors and compare the minimal work amounts for TF and P-1, as there is a bias towards TF in the factoring process because TF is done first.
The answer in math and computers is always: "It depends".

I think the current stats being close to even are at least partly because GIMPS does a good job determining when and how to do TF vs P1.
If the TF/P1 balance gets too much out of whack this will no longer hold.

Keep in mind how much faster GPUs are at TF than CPUs, or even GPUs, are at P1.
My personal example: I have a 2080Ti GPU running on an i7-7820x 8-core CPU.
So if one TF factor is found per 200 GHz-days, my GPU will find about 24 factors per day.
If one P1 factor is found per 200 GHz-days, my CPU will find one factor every 2 days.
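The arithmetic behind this comparison can be sketched quickly. This is a toy calculation; the GHz-days-per-day throughputs are back-derived from the figures in this post, not measured:

```python
# Back-of-envelope check of the throughput comparison above. The GHz-days
# figures are the ones implied by the post, not measurements.
ghzdays_per_factor = 200            # assumed cost of one factor, both methods
gpu_tf_ghzdays_per_day = 24 * 200   # 2080Ti TF rate implied by 24 factors/day
cpu_p1_ghzdays_per_day = 0.5 * 200  # i7-7820x P1 rate implied by 1 factor per 2 days

gpu_factors_per_day = gpu_tf_ghzdays_per_day / ghzdays_per_factor
cpu_factors_per_day = cpu_p1_ghzdays_per_day / ghzdays_per_factor
print(gpu_factors_per_day, cpu_factors_per_day)  # 24.0 vs 0.5, a ~48x gap
```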

Remember that with each successive bit level the TF work doubles, while the odds of finding a factor stay almost the same: approximately 1/(bit_level+1).
However, the expected TF success rate will decrease a little if extra P1 (or ECM) work has already been done, because P1/ECM will find some of the factors that the applied TF bit level would otherwise find.
As far as I know there is no calculator here that can compute the expected TF success rate taking into account how much P1/ECM has been done.
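The cost argument above can be sketched with the post's own approximation (odds of a factor at bit level b are about 1/(b+1), while work doubles per level); the bit-level range and work units here are illustrative:

```python
# Sketch of the cost argument above, using the post's approximation that the
# odds of a factor at TF bit level b are ~1/(b+1) while the work doubles.
costs = []
work = 1.0  # arbitrary work units at the first level considered
for bits in range(70, 76):
    odds = 1 / (bits + 1)      # chance of finding a factor at this bit level
    costs.append(work / odds)  # work per expected factor at this level
    work *= 2                  # TF work doubles with each successive bit level

ratios = [b / a for a, b in zip(costs, costs[1:])]
print([round(r, 2) for r in ratios])  # each level costs ~2x more per factor
```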

For the same reason, the more TF that has been done before P1, the lower the odds of P1 finding a factor. This calculator estimates it quite well: https://www.mersenne.ca/prob.php.
The P1 work required for a desired success rate increases rapidly.
For example, for exponent 35000001 with TF done to 74 bits:

2% 0.50 GHz-days
3% 1.30
4% 3.03
5% 6.36
6% 12.54
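The growth rate in that table is easy to quantify from the table itself:

```python
# The table above gives the P1 work needed for each success rate; the ratio
# between consecutive rows shows how quickly the cost grows.
table = {2: 0.50, 3: 1.30, 4: 3.03, 5: 6.36, 6: 12.54}  # % -> GHz-days
ratios = {pct: table[pct] / table[pct - 1] for pct in range(3, 7)}
for pct, r in ratios.items():
    print(f"{pct - 1}% -> {pct}%: {r:.2f}x the work")
# each additional percentage point roughly doubles the GHz-days required
```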

2021-03-16, 10:34   #563
SethTro

"Seth"
Apr 2019

273₁₀ Posts

I really appreciate the "ECM Summary", which gives a quick sense of the ECM work that's been completed. Would you consider:
1. Adding an "ECM Summary" summary row. This could be as simple as "X total GHz-days". It could also potentially include an estimated percentage of tXX completed (I'm happy to write some code if that would help out).
2. For exponents with lots of ECM rows, filtering out B1/B2 pairs with only a single ECM curve and GHz-days < 1.
2021-03-16, 11:56   #564
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

6440₈ Posts

Quote:
 Originally Posted by SethTro It could also potentially include an estimated percentage of tXX completed (I'm happy to write some code if that would help out)
While I vaguely understand the concept (e.g. "T-levels" section in YAFU docfile.txt) I have no idea how to calculate it, so if you want to provide some code that would be helpful.

2021-03-16, 17:18   #565
Viliam Furik

"Viliam Furík"
Jul 2018
Martin, Slovakia

457 Posts

petrw1 -> I wasn't talking about a factor being found for a given exponent by P-1 vs. TF, but about the amount of work needed to find a given factor. I don't think these are the same thing. For example, factors whose k is smooth can be found by P-1 using fewer GHz-days than by TF, as they are often of larger bit-size. If a factor is easier to find by TF than by P-1, that means it's a relatively small factor whose k has big factors. My "conjecture" is that the ratio of the count of factors easier to find by TF to the count easier to find by P-1 is about 1:1, and maybe even tends to 1:1 as the exponent goes to infinity. Probably not true, but it seems close enough.
2021-03-16, 17:46   #566
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

2⁵·3·5·7 Posts

Quote:
 Originally Posted by SethTro 1. Adding an "ECM Summary" summary row This could be as simple as "X total GHz Days" It could also potentially include an estimated percentage of tXX completed (I'm happy to write some code if that would help out) 2. For exponents with lots of ECM rows filtering B1/B2 pairs with only a single ECM curves and GHz Days < 1.
I have implemented these two (see for example M1277).

More specifically for part 2: I'm currently displaying any B1/B2 pairs that either have a total effort of >1.0 GHz-days, or at least 10 curves run at those bounds. That still leaves a number of single curves run at "oddball" bounds (but with non-trivial effort), and a number of bounds with trivial effort but more than 10 curves run. Let me know if you think the filtering criteria should be adjusted.

When you have some sample code for calculating t-level that I can emulate, I can add that value in as well.

2021-03-16, 23:45   #567
SethTro

"Seth"
Apr 2019

3×7×13 Posts

Quote:
 Originally Posted by James Heinrich I have implemented these two (see for example M1277). More specifically for part 2: I'm currently displaying any B1/B2 pairs that either have a total effort of >1.0GHz-days, or at least 10 curves run at those bounds. That still leaves a number of single curves run at "oddball" bounds (but with non-trivial effort), and a number of curves of trivial effort but >10 curves run. Let me know if you think the filtering criteria should be adjusted. When you have some sample code for calculating t-level that I can emulate, I can add that value in as well.
I'm not sure you enabled the change in production, I'm seeing 206 rows including
"123456 654321000 2 0.002" and
"1000 10000 1 2.40618e-7"
which I believe would be filtered by your proposed change.

2021-03-16, 23:51   #568
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

2⁵×3×5×7 Posts

Quote:
 Originally Posted by SethTro I'm not sure you enabled the change in production, I'm seeing 206 rows.
You probably have that page cached from prior to the change, then. Try Ctrl-F5 for a forced refresh; there should be only 180 rows, with the last one showing "various" instead of B1/B2.

2021-03-17, 00:39   #569
SethTro

"Seth"
Apr 2019

3·7·13 Posts

Quote:
 Originally Posted by James Heinrich I have implemented these two (see for example M1277). More specifically for part 2: I'm currently displaying any B1/B2 pairs that either have a total effort of >1.0GHz-days, or at least 10 curves run at those bounds. That still leaves a number of single curves run at "oddball" bounds (but with non-trivial effort), and a number of curves of trivial effort but >10 curves run. Let me know if you think the filtering criteria should be adjusted. When you have some sample code for calculating t-level that I can emulate, I can add that value in as well.
Perfect I see it now (after I reloaded without cache).

I wrote some simple starting code for estimating factoring progress. I used the numbers from the ECM progress report, and I imagine the UI could look similar (see attachment).

@VBCurtis, a couple of questions you could help out with:
1. Why is it called tXX?
2. The number of expected curves to find a t50 factor with a set B1/B2 is independent of the size of N, correct? AKA I can calculate B1/B2 -> expected curves and use that value for all exponents?
3. Do I need to change any of the ECM math, given Mersenne cofactors have a certain form?
4. Is this methodology generally sound?
Code:
# (B1, B2, expected number of curves at these bounds to complete a t35)
expected_curves = [(11e3, 11e5, 3495192), (11e3, 11e8, 374814), ..., (1e6, 100e6, 1566), (1e6, 900e6, 957), ...]
# t35 progress for 50 curves at 30e6,450e6 without info on this exact B1,B2
# pair can be estimated from the table entries whose bounds are both no larger:
min_curves_needed = min(count for b1, b2, count in expected_curves if b1 <= 30e6 and b2 <= 450e6)
progress = 50 / min_curves_needed
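A self-contained version of that idea can be sketched as follows. The four table entries are the ones visible in the snippet above; the function name `t35_progress` and the zero-progress fallback for undominated bounds are my own additions, not part of the original code:

```python
# Sketch of the t-level estimation idea above: among table entries whose
# bounds are both no larger than the bounds actually run, the smallest
# expected-curve count gives a conservative per-curve credit.
expected_curves_t35 = [  # (B1, B2, expected curves at these bounds for a t35)
    (11e3, 11e5, 3495192),
    (11e3, 11e8, 374814),
    (1e6, 100e6, 1566),
    (1e6, 900e6, 957),
]

def t35_progress(curves_run, b1, b2):
    """Estimate fractional t35 completion for curves run at bounds (b1, b2)."""
    dominated = [n for tb1, tb2, n in expected_curves_t35 if tb1 <= b1 and tb2 <= b2]
    if not dominated:
        return 0.0  # bounds smaller than every table entry: no credit
    return curves_run / min(dominated)

print(t35_progress(50, 30e6, 450e6))  # credits via the (1e6, 100e6, 1566) entry
```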

Last fiddled with by SethTro on 2021-03-17 at 00:39

2021-03-17, 03:14   #570
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

4781₁₀ Posts

Quote:
 Originally Posted by SethTro @VBCurtis couple of questions you could help out with 1. Why is it called tXX? 2. The number of expected curves to find a t50 factor with a set B1/B2 is independent of size of N correct? AKA I can calculate B1/B2 -> expected curves and use that value for all exponents? 3. Do I need to change any of the ECM math given Mersenne cofactors have a certain form? 4. Is this methodology generally sound? Code: # (B1, B2, expected_curves to find a t35 factor) expected_curves = [(11e3, 11e5, 3495192), (11e3, 11e8, 374814), ..., (1e6, 100e6, 1566), (1e6, 900e6, 957), ...] # t35 progress for 50 curves of 30e6,450e6 without info on this exact B1,B2 pair can be estimated by min_curves_needed = min(count for b1, b2, count in expected_curves if b1 < 30e6 and b2 < 450e6) progress = 50 / min_curves_needed
I am honored to receive such a call-out!
1. T45 means one "expects", in the probability sense, to find a 45-digit factor in that number of trials. 1/e of the factors of that size will be missed by a full T45, but one should move up to larger bounds by that point anyway. I don't know what the T stands for; tested?
2. Correct- number of curves does not change with input size. Time for each curve does, but of course we don't care about that when estimating work done.
3. No, I don't think any of the curve-count math changes for ECM on Mersennes.
4. If you have a copy of GMP-ECM, I'd invoke it with a toy input, the B1 and B2 values, and the -v flag to get the number of curves at that size needed to complete a T-level. You may be able to pull the calculation from the GMP-ECM source, actually. If grabbing the curve-count code is not tenable, I suggest building a table over a range of B1/B2 manually. Unfortunately, GMP-ECM won't use exactly B2 = 100*B1, since it uses an entirely different stage 2, so this may not work for common P95 choices.
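The 1/e remark in point 1 is easy to check directly. The curve count below is illustrative; the result is essentially independent of N:

```python
# Quick check of the "1/e missed" statement in point 1: if each curve finds
# a factor of the target size with probability 1/N, then after a full
# T-level of N curves the miss probability is (1 - 1/N)**N -> 1/e.
import math

N = 4500  # illustrative curve count for a T-level; exact value barely matters
miss = (1 - 1 / N) ** N
print(round(miss, 3), round(math.exp(-1), 3))  # both about 0.368
```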
Hope this helps!

2021-03-17, 06:41   #571
LaurV
Romulan Interpreter

Jun 2011
Thailand

24E5₁₆ Posts

Quote:
 Originally Posted by James Heinrich However in this case, it both looks ugly and probably contains a good deal of incorrect data.
You are wrong. The reports are beautiful. We don't need cats and squirrels walking around; the numbers are enough, they are clean to read, and we can copy-paste them without getting strange characters in the clipboard, etc.
Good job!

2021-03-17, 07:04   #572
LaurV
Romulan Interpreter

Jun 2011
Thailand

10010011100101₂ Posts

Re the ECM calculation: I remember somebody in the past (Alex Kruppa?) did a similar calculation, which was posted on the forum. The YAFU source code (Ben2?) could help too, including with where the tXX terminology comes from; there was a discussion about it here too when it was implemented in YAFU, possibly in the YAFU-related threads. Right now I am quite busy at my job, but I am laughing thinking that if RDS were still lurking around, he would have immediately sent us to read HIS book (which, to be fair, contains all sorts of things related to this calculation too).

Last fiddled with by LaurV on 2021-03-17 at 07:06

