mersenneforum.org Running msieve polynomial selection in parallel?
 Register FAQ Search Today's Posts Mark Forums Read

 2012-06-24, 21:45 #1 ryanp     Jun 2012 Boulder, CO 1111110002 Posts Running msieve polynomial selection in parallel? Hi, I'd like to maximize what I get out of factmsieve.py (i.e. Gilchrist's guide at http://gilchrist.ca/jeff/factoring/n...ers_guide.html) by running msieve's polynomial selection on several machines (5-10) in paralell. Currently it runs "msieve -np", e.g. Code: msieve -s -l -i -np According to msieve --help, "-np" takes optional arguments: Code: -np [X,Y] perform only NFS polynomial selection; if specified, only cover leading coefficients in the range from X to Y inclusive This seems like it might be helpful for running the polynomial selection in parallel across multiple machines. The question is: for a given input N, and K machines to run in parallel, what should X and Y be for each machine? Or is there a better way to parallel polynomial selection with msieve? Thanks!
 2012-06-24, 22:05 #2 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 2×7×461 Posts I've mostly done parallel polynomial selection on a single large SMP machine, so I can use 'make'; the difficulty is that the runs for small K take longer than for large K, with the runtime being roughly proportional to K^(-0.4) and really quite variable, so blocking it all in advance is a bit tricky and unlikely to give great results. One option I've used is simply to start at K=10^6, where ranges of 1000 take consistently about an hour on a C163. So msieve -v -np 1000000,1024000 msieve -v -np 1024000,1048000 msieve -v -np 1048000,1072000 on three separate machines (or three separate directories on a machine with three cores). You can apparently run several msieve jobs in the same directory but I wouldn't - one directory per job makes life easier.
 2012-06-25, 00:35 #3 ryanp     Jun 2012 Boulder, CO 1111110002 Posts Hi fivemack, Thanks for your reply. As I'm fairly new to this, out of curiosity -- how did you arrive at your base value (10^6) and increment (24000)? Would the same values still be useful for 5, 10, or 20 machines, say? Also, perhaps I'm asking the wrong question. Essentially I'm just trying to parallelize the work of factmsieve (msieve + ggnfs) across several machines. Would I get more benefit out of running msieve on a single machine, perhaps with a smallish time limit, and then the sieving step in parallel?
 2012-06-25, 06:15 #4 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 2·7·461 Posts You're definitely asking the right question. I'm picking 1000000 as the base because that's well into the flat portion of the curve, and 24000 as the increment because that's about 24 hours of work on a C163 on the machine I have access to. You might well want to get the scripts to run 1000000-1001000, time it, and pick the increment appropriately - but remember that the intrinsic timing variance is very large. I tend to run polynomial selection for somewhere between 10% and 20% of the time I expect the sieving to take (e.g. 24 hours on 24 threads for a C16x which will take about 250 hours on 24 threads to sieve; 24 hours on 4 threads for a C150 which will take about 18 hours on 48 threads to sieve). But the precise amount of selection you do makes very little difference.
 2012-06-25, 16:42 #5 chris2be8     Sep 2009 248110 Posts You could use factMsieve.pl from http://mersenneforum.org/showthread.php?t=15662 post 13. It's designed to do polynomial selection on several systems, then automatically select the best poly and start sieving. Read the whole thread for instructions etc. Chris
2012-07-01, 06:38   #6
ryanp

Jun 2012
Boulder, CO

1111110002 Posts

Quote:
 Originally Posted by chris2be8 You could use factMsieve.pl from http://mersenneforum.org/showthread.php?t=15662 post 13. It's designed to do polynomial selection on several systems, then automatically select the best poly and start sieving. Read the whole thread for instructions etc. Chris

So, for example, suppose I run polynomial selection on a few systems and end up with .fb output files like:

Code:
N 5140013334704851669242405887740004085957839657722764142543276828381162999697124563886598292766328463730755687076646610242046892464460226559
SKEW 589153.91
R0 -348443457246036719266568520
R1 1280647128257263
A0 -12834409323851043625932493277191207
A1 180101857702596183420795022293
A2 -385348143299525459124615
A3 -563481396530542305
A4 1408784852902
A5 1000692
Which should I choose? Is it the one with the least skew, greatest skew, or something else? Sorry if this is an obvious question -- still pretty new to factoring with msieve/GNFS :-)

Thanks again!

 2012-07-01, 08:05 #7 debrouxl     Sep 2009 32×109 Posts You'll usually want to choose the polynomial whose Murphy "E" score is the highest Didn't factMsieve.pl / msieve produce *.poly files that contain a comment indicating this e value ? Anyhow, one of the ways to obtain that score from a .fb file is Code: ~/msieve/msieve -i .ini -nf .fb -nc -v Sample output: Code: \$ ~/msieve/msieve -i 6_447_minus1.ini -nf 6_447_minus1.fb -nc -v Msieve v. 1.51 (SVN 719M) Sun Jul 1 10:00:37 2012 random seeds: b32131b6 5d85b626 factoring 616206951833849099509404360836151448327434546464783674219667801836184526666386100726932863976883607096993792653424911139417437242190871728765383549637572368393220053069548102628443735964624521073 (195 digits) searching for 15-digit factors commencing number field sieve (195-digit input) R0: -808281277464764060643139600456536293376 R1: 1 A0: 36 A1: 0 A2: 0 A3: 6 A4: 0 A5: 0 A6: 1 skew 1.82, size 1.891e-11, alpha 0.773, combined = 1.017e-12 rroots = 0 commencing relation filtering with target density 70.00 estimated available RAM is 3870.3 MB error: cannot open 'msieve.dat'
2012-07-01, 16:23   #8
chris2be8

Sep 2009

3×827 Posts

Quote:
 Originally Posted by ryanp Which should I choose? Is it the one with the least skew, greatest skew, or something else?
factMsieve.pl will select the one with the best score and automatically build a .poly file from it. It reads the *.msieve.dat.p files and picks the one with the best score (the number after e in the comment).

For really large jobs it's worth doing some test sieving for the poly's with the top few scores to make sure you have the best poly, the score isn't always spot on. But just using the one with the best score works for most projects.

Chris

2012-07-13, 16:04   #9
poily

Nov 2010

2×52 Posts

I've made a simple patch for the msieve to be able to run parallel polynomial selection over MPI. Both GPU and CPU modes are supported (not simulteniously but it shouldn't be too hard to add), up to 2 GPUs per node.

There are some troubles with early termination (like by Ctrl-C) though: MPI daemons I tried wouldn't give the program enough time to properly respond to the termination signal. This basically leave you with the raw .p file as a result and you should pick the best candidate from it some other way.

The patch is just a diff in the gnfs/poly directory. I hope someone finds it useful.
Attached Files
 mpi_polysel.diff.zip (1.7 KB, 427 views)

 2019-11-16, 19:45 #10 stown   Nov 2019 1 Posts Hi, all! Please, can help me with stage completing. i running np1 stage in parallel for C173 i have several files msieve.dat.m About 5Gb each file. How can i good poly from this file? Can msieve do this job, or some external tool. thanks!

 Similar Threads Thread Thread Starter Forum Replies Last Post jacky Msieve 8 2017-09-29 13:05 drone84 Msieve 4 2017-06-28 09:18 aein Factoring 3 2017-02-25 16:42 cgy606 Msieve 16 2016-10-06 14:16 CRGreathouse Factoring 2 2009-05-25 07:55

All times are UTC. The time now is 18:32.

Thu Sep 28 18:32:39 UTC 2023 up 15 days, 16:14, 0 users, load averages: 1.14, 1.03, 1.01