#1
"Seth"
Apr 2019
3·89 Posts
I'm taking a stab at ECM on a c251 (home prime 49, step 119) at the t65 level.

It's been tested to 2*t60, so I'm starting at t65 (B1=850e6, B2=1.4e13, ~80k curves). I have 2x 1080 Ti and a 32-core machine (2x E5-2650 v2). On the CPU, stage 1 takes ~7200s/curve and stage 2 takes ~1800s/curve; on the GPU, stage 1 is much faster (10x? 20x?).

Ideally I would like to do a ton of stage-1 computations on the GPUs and then distribute those to the CPUs for stage 2. Do I just pass `-save curves_<worker> <b1> 0` to the GPU worker and then `-resume curves_<worker> <b1> <b2>` to each CPU worker?
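For concreteness, a minimal sketch of that workflow using this post's bounds (the input and save file names are made up; note the thread's convention is to pass B2=1, which skips stage 2, rather than 0):
Code:
# GPU pass: stage 1 only (B2=1 < B1 means no stage 2); residues land in the save file
ecm -gpu -save curves_gpu0.save 850e6 1 < c251.txt

# CPU pass: resume the saved stage-1 residues and run only stage 2
ecm -resume curves_gpu0.save 850e6 1.4e13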
#2
Sep 2009
3⁴×5² Posts
That's about right, but you need to split the save file so each core running stage 2 gets its own set of lines from it.
Here's part of a script I use to run ECM to 35 digits. On this system I need 3 cores to run stage 2 as fast as the GPU does stage 1; you will probably need to change that. Code:
#!/bin/bash
function do_block {
  B1=$1
  # Stage 1 on the GPU; B2=1 (< B1) skips stage 2, residues go to $NAME.save
  /home/chris/ecm.2741/trunk/ecm -gpu -save $NAME.save $B1 1 < $INI | tee -a $LOG
  wait # for the last stage 2 to finish
  grep -q 'Factor found' $LOG* # Check if we found a factor
  if (($?==0)); then exit 0; fi
  # Replace the previous three-way split with a fresh one
  rm $NAME.saveaa $NAME.saveab $NAME.saveac
  split -n l/3 $NAME.save $NAME.save
  rm $NAME.save
  # One stage-2 worker per chunk; after its chunk is done, each core runs
  # ordinary curves, but only while the GPU's "-save" stage-1 run is still active
  (nice -n 19 /home/chris/ecm.2741/trunk/ecm -resume $NAME.saveaa $B1; /home/chris/ecm-6.4.4/ecm -c 999 -idlecmd 'ps -ef | grep -q [-]save' -n $B1 < $INI) | tee -a $LOG.1 | grep [Ff]actor &
  (nice -n 19 /home/chris/ecm.2741/trunk/ecm -resume $NAME.saveab $B1; /home/chris/ecm-6.4.4/ecm -c 999 -idlecmd 'ps -ef | grep -q [-]save' -n $B1 < $INI) | tee -a $LOG.2 | grep [Ff]actor &
  (nice -n 19 /home/chris/ecm.2741/trunk/ecm -resume $NAME.saveac $B1; /home/chris/ecm-6.4.4/ecm -c 999 -idlecmd 'ps -ef | grep -q [-]save' -n $B1 < $INI) | tee -a $LOG.3 | grep [Ff]actor &
}
Chris
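Not part of Chris's script, but the same split-and-resume step generalizes to N cores; a sketch with placeholder file names:
Code:
# Hypothetical N-way variant: one stage-2 worker per chunk of the save file
N=8                                   # stage-2 cores available; tune per machine
B1=850e6
split -n l/$N -a 2 curves.save part_  # line-based split: part_aa, part_ab, ...
for f in part_??; do
  nice -n 19 ecm -resume "$f" $B1 | tee -a "$f.log" | grep -i factor &
done
wait  # block until every stage-2 worker has finished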
#3
Mar 2006
111011001₂ Posts
You should also know about the previous work that was done on this number. The last posted record is post #111 in the following thread:
https://mersenneforum.org/showthread.php?t=3238&page=11

In it you can see that I had finished: Code:
HP49 Step 119 c251:
Expected number of curves to find a factor of n digits:
digits           35         40         45        50        55        60        65        70       75
-----------------------------------------------------------------------------------------------------
B1 = 11e6       138        788       5208     39497    336066   3167410  3.2e+007  3.7e+008
B1 = 43e6        61        278       1459      8704     57844    419970   3346252  2.9e+007
B1 = 260e6       23         82        335      1521      7650     42057    250476   1603736
B1 = 1e9         14         44        154       599      2553     11843     59619    319570
B1 = 3e9         11         31         96       335      1279      5292     23661    112329   565999
B1 = 3e9         11         31         99       344      1315      5446     24234    115138   580561  (-maxmem 4096)

Number of curves completed so far, and their equivalent t-level:
 12000 @ 11e6   =    86.956x    15.228x     2.304x    0.303x    0.035x    0.003x
 18000 @ 43e6   =   295.081x    64.748x    12.337x    2.068x    0.311x    0.042x   0.005x
 90000 @ 260e6  =  3913.043x  1097.560x   268.656x   59.171x   11.764x    2.139x   0.359x   0.056x
120000 @ 1e9    =  8571.428x  2727.272x   779.220x  200.333x   47.003x   10.132x   2.012x   0.375x
 16000 @ 3e9(4g)=  1454.545x   516.129x   161.616x   46.511x   12.167x    2.937x   0.660x   0.138x
 19200 @ 3e9    =  1745.454x   619.354x   200.000x   57.313x   15.011x    3.628x   0.811x   0.170x
------------------------------------------------------------------------------------------------
                  16066.507x  5040.291x  1424.133x  365.699x   86.291x   18.881x   3.847x   0.739x
I continued working on the number for a while, splitting the work as you are planning to do. I ended up doing a total of 162,000 curves at B1=3e9; since t70 calls for 112,329 curves at B1=3e9, that means I did a total of 1.442*t70. You may want to start at B1=3e9, or work even higher. I'm just about to finish 1000 curves at B1=10e9, which isn't much compared to the previous effort.

When doing the work I would manually run the curves on the GPUs, then use my ecm.py script to spread all the finished stage-1 curves across multiple stage-2 jobs. For example, I had one GPU (I forget which) that could do 6400 stage-1 curves in 30 days, and another that could do 9600 stage-1 curves in about 30 days, so I could finish 16000 stage-1 curves per month. I would then use ecm.py to split those 16000 curves into 7 jobs (about 2286 curves per job), which could finish the work in 32 days. So, you may want to do the work in batches like this. The commands I used would look like:
Code:
./ecm -gpu -gpudevice 0 -gpucurves 9600 -save g0-001_9600_3e9.txt 3e9 1 < hp49_119_c251.txt
./ecm -gpu -gpudevice 1 -gpucurves 6400 -save g1-001_6400_3e9.txt 3e9 1 < hp49_119_c251.txt
And then I would combine those files into a single c-001_16000_3e9.txt file and use ecm.py like so:
Code:
python ecm.py -threads 7 -resume c-001_16000_3e9.txt -out r-001_16000_3e9.txt
You need to make sure you have enough memory to run 7 (or more) stage-2 jobs with B1=3e9, otherwise they will swap and slow down drastically. I believe each job took up to 14500MB, which means the total used for those jobs on that machine was about 100GB.

You can find my ecm.py script in the following thread (the most recent version will be towards the end of the thread):
https://mersenneforum.org/showthread.php?t=15508
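As a sanity check on that 1.442*t70 figure (this arithmetic is added here, using the same 1 - 1/e^x model quoted later in this thread):
Code:
# 162000 curves done at B1=3e9; the table above expects 112329 curves for t70
awk 'BEGIN {
  x = 162000 / 112329                 # multiples of t70 completed
  printf "%.3f x t70, miss chance for a 70-digit factor = %.1f%%\n", x, 100 * exp(-x)
}'
# -> 1.442 x t70, miss chance for a 70-digit factor = 23.6%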
#4
"Curtis"
Feb 2005
Riverside, CA
4732₁₀ Posts
This is a truly awesome amount of work! I appreciate the detailed accounting of curves and t-levels. I just started the Cunningham 2_2330L factorization, which will saturate all my large-memory machines for the next season or two; I'd like to contribute some B1=1e10 curves to honor your effort, but they'll have to wait until Fall.
#5
"Seth"
Apr 2019
267₁₀ Posts
Thanks for the status update! I had seen your 2014 status from http://worldofnumbers.com/topic1.htm but hadn't searched here.
Do you know of a write-up of the performance impact of using -maxmem? I did several searches but didn't find anything definitive (other than that the impact is non-linear: https://www.mersenneforum.org/showpo...0&postcount=52).
Since you seem to be actively working on this, what curves would be most useful for me to work on? I'm willing to spend a 1080 GPU-month plus a couple of core-years on high-memory machines.
#6
Mar 2006
111011001₂ Posts
I used 7 because I was able to run that many jobs on a machine with 128GB of RAM. You'll need to calculate how much RAM a job will take (with or without maxmem) and then figure out how many you can run in parallel.
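A quick sketch of that sizing calculation (the per-job figure is the ~14500MB reported in post #3; treat it as an assumption and measure your own):
Code:
# How many B1=3e9 stage-2 jobs fit in RAM without swapping?
TOTAL_MB=$((128 * 1024))  # 128GB machine
PER_JOB_MB=14500          # observed peak per stage-2 job at B1=3e9 (post #3)
echo $(( TOTAL_MB / PER_JOB_MB ))  # -> 9; running 7 leaves headroom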
For example, using B1=11e6 without maxmem will look like: Code:
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=1:2580904028
dF=32768, k=3, d=324870, d2=11, i0=23
Expected number of curves to find a factor of n digits:
   35     40     45      50      55       60        65
   138    788    5208    39497   336066   3167410   3.2e+007
while with -maxmem set (the "20" and "40" rows in the table further below) it will look like: Code:
Using B1=11000000, B2=28545931060, polynomial Dickson(12), sigma=1:3439101351
dF=8192, k=40, d=79170, d2=11, i0=128
Expected number of curves to find a factor of n digits:
   35     40     45      50      55       60        65
   142    813    5420    40914   348634   3290145   3.4e+007
Code:
Using B1=11000000, B2=28544268490, polynomial Dickson(12), sigma=1:2669738918
dF=16384, k=10, d=158340, d2=11, i0=59
Expected number of curves to find a factor of n digits:
   35     40     45      50      55       60        65
   142    813    5420    40914   348634   3290145   3.4e+007
In this case, if you ran 10000 curves at B1=11e6 with no, 20, or 40 maxmem, your odds of finding factors of various sizes would be: Code:
Chance to find factor, of size d, = 1 - 1/e^x, where x = 10000/recommended_curves
d    35      40          45        50        55       60
no   ~100%   99.9996%    85.341%   22.367%   2.931%   0.315%
20   ~100%   99.9995%    84.197%   21.683%   2.827%   0.303%
40   ~100%   99.9995%    84.197%   21.683%   2.827%   0.303%

Chance to miss factor, of size d, = 1/e^x, where x = 10000/recommended_curves
d    35     40          45        50        55        60
no   ~0%    0.000308%   14.658%   77.632%   97.068%   99.684%
20   ~0%    0.000455%   15.802%   78.316%   97.172%   99.696%
40   ~0%    0.000455%   15.802%   78.316%   97.172%   99.696%
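To reproduce one cell of that table from the formula (a check added here, not in the original post):
Code:
# d=45, no maxmem: 10000 curves run, 5208 expected to find a 45-digit factor
awk 'BEGIN {
  x = 10000 / 5208
  printf "find: %.2f%%  miss: %.2f%%\n", 100 * (1 - exp(-x)), 100 * exp(-x)
}'
# -> find: 85.34%  miss: 14.66%, matching the "no" / 45 entries above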
"Active" is subjective. I haven't really worked on this since Nov 2017. I think it was the 1080 that could complete 9600 stage-1 curves with B1=3e9 in ~30 days. My computers are busy, so you'll have to verify this on your own machine. I'd recommend running with B1=3e9 or higher. |
#7
"Seth"
Apr 2019
3·89 Posts
Thanks for the information.
I did a little investigation of maxmem myself between my last post and now:
https://docs.google.com/spreadsheets...it?usp=sharing
https://github.com/sethtroisi/misc-s...ter/ecm-maxmem

The result is that stage 2 slows down roughly in proportion to the memory reduction, while B2 shrinks only slightly (about 10% on average). So if ecm would use 16000MB by default and I set maxmem=4000MB, I expect stage 2 to take about 4x as long!
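A sketch of the kind of sweep behind those numbers (file names are placeholders; assumes GMP-ECM's -maxmem, in MB, and an existing stage-1 save file):
Code:
# Time stage 2 at several -maxmem limits, resuming the same stage-1 residues
B1=850e6
for MM in 16000 8000 4000 2000; do
  ecm -maxmem $MM -resume curves.save $B1 2>&1 \
    | tee "maxmem_${MM}.log" | grep -E 'B2=|Step 2 took'
done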