mersenneforum.org > Data Thinking out loud about getting under 20M unfactored exponents

2021-11-24, 06:49   #925
De Wandelaar

"Yves"
Jul 2017
Belgium

3⁴ Posts

22.3M is quite completely finished with a lot of the work done by TheJudger. My reserved exponents will be processed within less than 2 days. @Wayne : What's the next range for TF work ? Thanks.

2021-11-24, 15:30   #926
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006

4986₁₀ Posts

Quote:
 Originally Posted by SethTro With James's help (thanks for everything). I rigged up a P-1 calculator in python and tried to optimize TF vs P-1. I use a similar methodology to what Wayne does.
This is really cool.
I've been dreaming of such a "calculator" since I started....I wish I was better at programming; and math; and time management. LOL

I'll need a few days to wrap my head around all of this then I'll try to make some useful comments.

Thanks

2021-11-24, 15:33   #927
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006

2×3²×277 Posts

Quote:
 Originally Posted by De Wandelaar 22.3M is quite completely finished with a lot of the work done by TheJudger. My reserved exponents will be processed within less than 2 days. @Wayne : What's the next range for TF work ? Thanks.
If you are using GPU72 I think it should give you 21.9 or 21.8 or 21.7.
I don't know if Chris has to make magic or if it will just happen.

Thanks

2021-11-24, 16:04   #928
De Wandelaar

"Yves"
Jul 2017
Belgium

121₈ Posts

Quote:
 Originally Posted by petrw1 If you are using GPU72 I think it should give you 21.9 or 21.8 or 21.7. I don't know if Chris has to make magic or if it will just happen. Thanks
I got 21.9 exponents.

2021-11-24, 16:51   #929
techn1ciaN

Oct 2021
U.S. / Maine

10000100₂ Posts

Quote:
 Originally Posted by petrw1 I have an i7-7820X with 3600 DDR4 RAM that for unknown reasons performs best with 1 Worker x 8 Cores.
What exponent did you test with? Because of the thread we're in I'm going to guess something well below the DC wavefront but still with eight digits, say 20 M. In that case you are at the "sweet spot" for your CPU where one FFT (one worker) can just fit in your L3 cache, but two or more can't (2+ workers) and get evicted to your slower DRAM. This erases the throughput gains one might typically hope to make by running more workers.

I own a Ryzen 5 3600XT, whose L3 cache is large for a consumer CPU (32 MB) and can actually fit up to DC wavefront FFTs. So I stick to one worker because I tend to run DC. I recently did an informal "benchmark" with some of my DC work (59 to 61 M exponents) and 2 workers * 3 threads indeed performed significantly worse than 1 worker * 6 threads. 3 workers * 2 threads made up some ground, but not all of it. I will have to reevaluate after the DC wavefront advances a bit (say to 70 M).
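
The cache argument above can be made concrete with a rough footprint estimate. This is only a sketch under stated assumptions, not how Prime95 actually selects FFT lengths: it assumes roughly 18 bits of the exponent per double-precision FFT word (a round illustrative figure), 8 bytes per word, and it ignores twiddle tables and other working memory. The 11 MB L3 size for the 7820X is an assumption not stated in the thread; the 32 MB figure is from the post.

```python
def fft_footprint_mb(exponent, bits_per_word=18.0, bytes_per_word=8):
    """Rough FFT data size in MB for testing 2^exponent - 1.

    bits_per_word is an assumed average; real FFT lengths come in
    discrete sizes and pack a varying number of bits per word.
    """
    words = exponent / bits_per_word
    return words * bytes_per_word / 1e6


def workers_that_fit(exponent, l3_mb):
    """Number of independent FFTs of this size that fit in an L3 of l3_mb MB."""
    return int(l3_mb // fft_footprint_mb(exponent))


# (label, exponent, assumed L3 size in MB)
cases = [
    ("7820X, 20M exponent", 20_000_000, 11.0),
    ("3600XT, 60M exponent", 60_000_000, 32.0),
]

for label, exponent, l3_mb in cases:
    size = fft_footprint_mb(exponent)
    fit = workers_that_fit(exponent, l3_mb)
    print(f"{label}: ~{size:.1f} MB per FFT, {fit} worker(s) fit in L3")
```

Under these assumptions both machines hold exactly one FFT in L3, which would be consistent with one worker using all cores coming out ahead on each.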

Quote:
 Originally Posted by petrw1 If P-1 is so fast now relative to PRP let it find as many factors as possible and save as many expensive PRP tests as possible. Maybe it should be 2.5 or 3 to 1 tests-saved?
Kriesel recently did some fantastic analysis that confirmed tests_saved=1 is solidly best for overall GIMPS throughput at the current wavefront with current Prime95 versions. Since the automatic bound calculator is internal to the software, I imagine Mr. Woltman will adjust it such that this continues to be the case and larger bounds do not need to be manually forced. With such a dramatic stage 2 boost, it probably will take a few releases for the accuracy to be sharpened so something like tests_saved=1.2 may turn out to be optimal in the short term (or even, say, tests_saved=0.9 if the new calculator swings too hard in the other direction).
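
To illustrate what tests_saved changes, here is a minimal sketch of the usual cost/benefit framing: P-1 is worth running when the expected primality-test work it avoids exceeds its own cost. The function name and every number below are hypothetical, chosen only for illustration; the real bound calculator inside Prime95 is more involved and is not reproduced here.

```python
def p1_net_saving(factor_prob, p1_cost, test_cost, tests_saved=1.0):
    """Expected work saved (same units as the costs) by running P-1 first.

    factor_prob : chance that P-1 at the chosen bounds finds a factor
    p1_cost     : cost of the P-1 run itself
    test_cost   : cost of one primality test
    tests_saved : how many tests a found factor is assumed to avoid
    """
    return factor_prob * tests_saved * test_cost - p1_cost


# Hypothetical inputs: 4% factor chance, 5 units of P-1 work, 400 units per test.
for tests_saved in (0.9, 1.0, 1.2):
    saving = p1_net_saving(0.04, 5.0, 400.0, tests_saved)
    print(f"tests_saved={tests_saved}: net saving {saving:.1f}")
```

A larger tests_saved makes each factor found worth more, so it justifies spending more on larger bounds; a smaller value does the opposite.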

Quote:
 Originally Posted by petrw1 Similarly it is because GPUs are SOOOO much faster at TF that we bumped the pre-PRP TF by a few bits to save PRP tests.
A nitpick: GPU factoring programs have had the effect of 3 or 4 more bit levels for almost all pre-PRP TF, but the "official" TF level where an exponent is cleared for PRP has still not changed from when factoring was done by CPU. Put differently, an exponent of 332,xxx,xxx (for example) will be TFed to at least 81 bits as long as someone from GPU72 can get to it, but if this does not happen then the server will not hold up the PRP assignment.

2021-11-24, 17:10   #930
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006

2·3²·277 Posts

Quote:
 Originally Posted by techn1ciaN What exponent did you test with? Because of the thread we're in I'm going to guess something well below the DC wavefront but still with eight digits, say 20 M. In that case you are at the "sweet spot" for your CPU where one FFT (one worker) can just fit in your L3 cache, but two or more can't (2+ workers) and get evicted to your slower DRAM. This erases the throughput gains one might typically hope to make by running more workers.
Interesting comment on the L3.
When I first got this new 7820x PC I was doing P-1 from 40M to 50M. 1 Worker x 8 Cores performed the best.
Yet my quad-core i5-3570s, working the same range, still perform P-1 the fastest with 4 Workers x 1 Core.
I actually haven't rechecked now that I am in the 20M to 30M range.

Quote:
 Kriesel recently did some fantastic analysis that confirmed tests_saved=1 is solidly best for overall GIMPS throughput at the current wavefront with current Prime95 versions. Since the automatic bound calculator is internal to the software, I imagine Mr. Woltman will adjust it such that this continues to be the case and larger bounds do not need to be manually forced. With such a dramatic stage 2 boost, it probably will take a few releases for the accuracy to be sharpened so something like tests_saved=1.2 may turn out to be optimal in the short term (or even, say, tests_saved=0.9 if the new calculator swings too hard in the other direction).
OK, Kriesel/George know best.

Quote:
 A nitpick: GPU factoring programs have had the effect of 3 or 4 more bit levels for almost all pre-PRP TF, but the "official" TF level where an exponent is cleared for PRP has still not changed from when factoring was done by CPU. Put differently, an exponent of 332,xxx,xxx (for example) will be TFed to at least 81 bits as long as someone from GPU72 can get to it, but if this does not happen then the server will not hold up the PRP assignment.
Agreed, but GPU72 is trying to push the bits up as much as it can.

2021-11-24, 18:03   #931
chalsall
If I May

"Chris Halsall"
Sep 2002

23706₈ Posts

Quote:
 Originally Posted by De Wandelaar I got 21.9 exponents.
Never send a human to do a machine's job...

George... Absolutely amazing work! Can't wait to start working with it!!!

2021-11-24, 20:37   #932
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006
Saskatchewan, Canada

2·3²·277 Posts

I'm taking 24.7M next; I expect to start Monday.

2021-11-24, 22:11   #933
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006

4986₁₀ Posts

Quote:
 Originally Posted by SethTro With James's help (thanks for everything). I rigged up a P-1 calculator in python and tried to optimize TF vs P-1. I use a similar methodology to what Wayne does.
 Code:
 [17000000,17100000] 156 needed, current TF 2155 x 72

 Existing P-1 for interval greater than B1=100000, B2=2025000 => 2.0%
 Last P-1 2155 x 9.6% @ B1=20405627 B2=413213953
 0 GPU(72) + 74390.0 CPU | GHz-Days/factor GPU: 0.0 CPU: 476.9 | 0.0x GPU/CPU
 Factors/Tests 0.0/0 TF + 156.0/2155 P-1 @ B1=20405627 B2=413213953

 Existing P-1 for interval greater than B1=100000, B2=2025000 => 1.8%
 Last P-1 2155 x 8.0% @ B1=11808812 B2=239128445
 60443 GPU(73) + 43049.8 CPU | GHz-Days/factor GPU: 2159.7 CPU: 336.3 | 1.4x GPU/CPU
 Factors/Tests 28.0/2155 TF + 128.0/2155 P-1 @ B1=11808812 B2=239128445

 3314264 GPU(1605 x 78) + 0 CPU | 156/12380 TF

 Each block stands for completing another bitlevel of TF. Starting with no additional TF and going up to completing TF 2^77.
 If we do no additional TF we'll need P-1 bounds around B1=20M, B2=413M to get 156 factors from 2155 P-1 tests.
Again this is awesome.
Sorry, but the following sounds pretty disjointed ...

I've always wished I could find an Excel formula to calculate the factor ratio difference with a given B1/B2.
(Yes I use Excel ... fossil as it is)
I've tried to convert prob.php to an Excel macro but it was beyond me.
I would love a dumbed-down version of prob.php that I could use as a function/formula in Excel ... for my purposes here it wouldn't have to be exact.

A few observations from my experience over the last 4 years
I don't think you will get 28 factors going to 73 bits.
I know the PrimeNet math suggests so but I suspect that is based on NO P-1.
With all the big factoring done recently in the 2x.xM ranges I've been seeing closer to 24 per range where there is low to moderate current P-1 done.
It would be great if there was a tool similar to prob.php that could calculate the expected TF success rate based on how much P-1 (or ECM) has been done.
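
For reference, the "NO P-1" baseline mentioned above can be sketched with the common GIMPS rule of thumb that the chance of a factor between 2^x and 2^(x+1) is about 1/x. This is only that heuristic, with no adjustment for P-1 or ECM already done, which is exactly the gap described; the function name is hypothetical and this is not the formula used by prob.php or by SethTro's calculator.

```python
def expected_tf_factors(n_exponents, from_bits, to_bits):
    """Expected factors from taking n_exponents from from_bits to to_bits of TF.

    Uses the rule of thumb P(factor in [2^x, 2^(x+1)]) ~ 1/x and removes
    exponents that are expected to be factored before the next bit level.
    Ignores any prior P-1/ECM, so it tends to overestimate.
    """
    remaining = float(n_exponents)
    total = 0.0
    for bits in range(from_bits, to_bits):
        found = remaining / bits
        total += found
        remaining -= found
    return total


one_level = expected_tf_factors(2155, 72, 73)
print(f"72 -> 73 bits, 2155 exponents: ~{one_level:.1f} factors before any P-1 discount")

# Ratio of an observed count to the heuristic, e.g. the ~24 per range quoted above.
print(f"observed/heuristic ratio for 24 factors: {24 / one_level:.2f}")
```

Comparing observed counts against this baseline gives a rough empirical discount for ranges that already have low to moderate P-1.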

Taking the entire 17.0 range to 77 bits is about 3.75M GHz-Days, or 2.3 years with my 2080Ti GPU.
If I were to take all exponents to the highest Bounds above and run this on my 5 home CPUs it would take me about 270 days.
So if I was tasked to complete 17.0 with the hardware I have I would do a little more TF but rely mostly on P-1.
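
The arithmetic behind that comparison can be checked with a few lines. This sketch only back-calculates the throughput implied by the figures in the post; it assumes that "the highest Bounds above" refers to the 74390.0 CPU GHz-Days line in the quoted output, which is my reading rather than something the post states.

```python
DAYS_PER_YEAR = 365.25


def rate_per_day(total_ghz_days, days):
    """Implied throughput in GHz-Days per day."""
    return total_ghz_days / days


# TF of the whole range to 77 bits: 3.75M GHz-Days in about 2.3 years on one GPU.
gpu_rate = rate_per_day(3_750_000, 2.3 * DAYS_PER_YEAR)

# P-1 at the largest quoted bounds: 74390 GHz-Days in about 270 days on 5 CPUs (assumed mapping).
cpu_rate = rate_per_day(74_390.0, 270)

print(f"Implied GPU TF throughput: ~{gpu_rate:.0f} GHz-Days/day")
print(f"Implied combined CPU P-1 throughput: ~{cpu_rate:.0f} GHz-Days/day")
print(f"TF-only route takes ~{2.3 * DAYS_PER_YEAR / 270:.1f}x as long as the P-1-only route")
```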

At the individual level, I guess that is what each of us might want to do when adopting a range:
--- Use your tool to calculate the TF vs P-1 effort as above.
--- Consider what hardware they own and use it appropriately
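
As a rough illustration of the first step, the cost-per-factor figures in the quoted output can be recomputed from its own totals. The dictionary layout and names below are hypothetical; the numbers are copied from the quoted calculator output. The recomputed GPU figure may differ slightly from the quoted 2159.7, presumably because the factor counts shown there are rounded.

```python
# Totals copied from the quoted output for [17000000, 17100000].
scenarios = {
    "no extra TF (stay at 72 bits)": {
        "gpu": 0.0, "cpu": 74390.0, "tf_factors": 0.0, "p1_factors": 156.0,
    },
    "TF to 73 bits, then P-1": {
        "gpu": 60443.0, "cpu": 43049.8, "tf_factors": 28.0, "p1_factors": 128.0,
    },
}


def per_factor(work, factors):
    """GHz-Days spent per factor found; 0.0 when no factors are expected."""
    return work / factors if factors else 0.0


for name, s in scenarios.items():
    gpu_cost = per_factor(s["gpu"], s["tf_factors"])
    cpu_cost = per_factor(s["cpu"], s["p1_factors"])
    print(f"{name}: GPU {gpu_cost:.1f}, CPU {cpu_cost:.1f} GHz-Days/factor, "
          f"total work {s['gpu'] + s['cpu']:.1f}")
```

Which scenario is preferable then depends on the second step: whether the GPU or the CPU hours are the scarcer resource for the person adopting the range.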

But at a grander scale for this project it is always a balancing act considering how much TF vs P-1 power is available.
And that has changed dramatically month to month.
I have not much of a clue what the current total capacities are for TF and P-1.

But if I did, your tool would be great for suggesting/recommending TF levels vs P-1 bounds for each range.

Thanks again
Wayne

2021-11-24, 23:27   #934
gLauss

Nov 2014

3·13 Posts

Quote:
 Originally Posted by masser I've attached 38 candidates from the 5.5M range that should be safe from the TF wavefront and where additional P-1 is needed. After you complete these, we can probably find similar candidates in higher ranges. Wayne and others, is anyone working the 5.5M range? EDIT: - gLauss, be sure to allocate enough ram to get a decent B2.
Unfortunately, axn poached that list of P-1 on at least one exponent 5547953. So I will stop working on any exponents for now, please "unreserve" everything. I probably work too irregularly and am not contributing much anyway with my laptop - so I will work on whatever I like, but only on ranges where there has been no progress for at least some time, and I will make sure to report my results promptly.

2021-11-25, 01:25   #935
chalsall
If I May

"Chris Halsall"
Sep 2002

2·3·1,697 Posts

Quote:
 Originally Posted by gLauss Unfortunately, axn poached that list of P-1 on at least one exponent 5547953. So I will stop working on any exponents for now, please "unreserve" everything.
Sorry about that (although I wasn't involved).

Please know we tend to work rather fast 'round these here parts...


All times are UTC.