mersenneforum.org > YAFU SIQS Across Multiple Machines
 Register FAQ Search Today's Posts Mark Forums Read

 2022-03-22, 02:09 #1 EdH     "Ed Hall" Dec 2009 Adirondack Mtns 22×52×47 Posts SIQS Across Multiple Machines SIQS is already multi-threaded, but is it (at least) theoretically possible to run it across more machines, possibly via openmpi? Or, is it already available and I'm just not aware of it?
2022-03-22, 09:24   #2
henryzz
Just call me Henry

"David"
Sep 2007
Liverpool (GMT/BST)

23·7·107 Posts

Quote:
 Originally Posted by EdH SIQS is already multi-threaded, but is it (at least) theoretically possible to run it across more machines, possibly via openmpi? Or, is it already available and I'm just not aware of it?
Not aware of this being available. For numbers that could vaguely benefit from this cado probably isn't a bad option.

 2022-03-22, 13:31 #3 bsquared     "Ben" Feb 2007 5·727 Posts It is not a built-in capability, but it is possible to do. (I have done it, e.g., on this C130.) "Just" start N instances of yafu on N machines in N different folders. The options -siqsT (time limit) or -siqsR (relations limit) could be used if needed. Manual gathering and post-processing tests would be required.
2022-03-22, 13:59   #4
EdH

"Ed Hall"
Dec 2009

470010 Posts

Quote:
 Originally Posted by bsquared It is not a built-in capability, but it is possible to do. (I have done it, e.g., on this C130.) "Just" start N instances of yafu on N machines in N different folders. The options -siqsT (time limit) or -siqsR (relations limit) could be used if needed. Manual gathering and post-processing tests would be required.
Thanks! I will work on some scripts or such. Are you saying, simply supplying siqs(<composite>) -siqsT <time> on each can be combined on one for the final count? Would I simply cat the relations to siqs.dat? And, would I need to stop and restart the host machine for it to find all the relations?

2022-03-22, 14:16   #5
bsquared

"Ben"
Feb 2007

5·727 Posts

Quote:
 Originally Posted by EdH Thanks! I will work on some scripts or such. Are you saying, simply supplying siqs() -siqsT
I think yes, to all. Each instance is started with random polynomials so siqs(<composite>) -siqsT <time> is sufficient. cat all relation files together. restart one process on the combined file for post processing. repeat as necessary. if you do have to repeat, you'd have to delete each remote siqs.dat first, otherwise all of them will resume and the next time you combine .dat files you'll have many duplicates. I recommend backing up the combined file if you can before post processing, just to prevent blunders.

If you are thinking of doing this on C105+ numbers for some reason then 3LP is faster, but only with parameters that are not auto-selected. Also I haven't tested 3LP on many machines; it might get tricky with the batch factoring that is used in 3LP. We can talk over strategies if this is something you're interested in.

 2022-03-22, 15:09 #6 EdH     "Ed Hall" Dec 2009 Adirondack Mtns 470010 Posts Thanks! I'm kind of only interested in playing at this point. It may not lead to anything. I'm running ECM on a cluster with ecmpi and NFS with CADO-NFS/Msieve across the same set of machines, but I have whichever host is up, currently running SIQS <100 digits for those that fail ECM. A couple of my hosts have 40 threads and run SIQS pretty quickly, but others are a bit more limited, so I thought I'd look at distributing SIQS. The 40 thread hosts are often tied up for many hours on Msieve LA. I may get somewhere or I may abandon the idea.
 2022-03-23, 12:36 #7 EdH     "Ed Hall" Dec 2009 Adirondack Mtns 22·52·47 Posts Hey Ben, I think I've laid out a concept for some scripts, but I'm wondering about needed relations. Is that value calculated or from a table? Maybe I'll make up another "How I. . ." thread, with a comparison between YAFU SIQS and CADO-NFS on the same machines. The tune function sets the crossover for working on a single machine, but for my current setup, I use a much lower crossover because SIQS is limited to the host machine, but CADO-NFS includes a bunch of external clients. I don't expect that to change if I distribute SIQS, but I'd like to "play" and see what results. Thanks, -Ed
2022-03-23, 12:55   #8
bsquared

"Ben"
Feb 2007

5×727 Posts

Quote:
 Originally Posted by EdH Hey Ben, I think I've laid out a concept for some scripts, but I'm wondering about needed relations. Is that value calculated or from a table? Maybe I'll make up another "How I. . ." thread, with a comparison between YAFU SIQS and CADO-NFS on the same machines. The tune function sets the crossover for working on a single machine, but for my current setup, I use a much lower crossover because SIQS is limited to the host machine, but CADO-NFS includes a bunch of external clients. I don't expect that to change if I distribute SIQS, but I'd like to "play" and see what results. Thanks, -Ed
If you are talking normal SIQS and not 3LP, then there is no table or calculation. Raw relations are added directly to a tree structure that maintains an exact count of cycles on the fly. So we stop gathering as soon as we have enough cycles. That's on one machine, where the master process has control over all relations found. For multiple machines you'll probably have to develop your own table of approximate raw relation counts to gather for a given input size.

The tune.c file in yafu has raw relation counts for its built-in numbers, as a start:

Code:
uint32_t siqs_actualrels[NUM_SIQS_PTS] = {17136, 32337,
63709, 143984, 242825, 589192, 847299, 1272852, 1709598};

double siqs_sizes[NUM_SIQS_PTS] = {60, 65, 70, 75, 80, 85, 90, 95, 100};
3LP is done differently but if your crossover is that low then 3LP will not be useful.

2022-03-23, 15:31   #9
EdH

"Ed Hall"
Dec 2009

22·52·47 Posts

Quote:
 Originally Posted by bsquared . . . 3LP is done differently but if your crossover is that low then 3LP will not be useful.
Thanks! If distributed SIQS is easy to implement and at all effective, it might move that crossover up. Then maybe 3LP might even be something to play with.

As an aside, factordb has become a poor source for composites. I tried several 70-80 digit composites from its list and they all crashed SIQS with "c not divisible by 4 in Q2(x) variation!" messages. After too much testing, I discovered it was due them all being made up of entirely small factors. I had to construct my own test composites.

 2022-03-23, 19:30 #10 EdH     "Ed Hall" Dec 2009 Adirondack Mtns 111348 Posts I guess I've been successful! I have scripts that distribute the SIQS across four machines total and factors are returned. I started with a host that has 24 threads and added one client with 8 threads and two clients with 4 threads each. All are running YAFU 2.07. After a small amount of adjusting/balancing of -siqsR I compared a normal run on the host (24 threads) to a distributed run with all four machines (40 threads total). The single machine run for a c90 took 128 seconds, while the distributed run took 89 seconds. The relations were well above the listed requirement, so more adjustment could trim even more time. I then tried the runs with a c100. The single machine took 14:18, while the distributed run took 11:58. However, I'm sure I can adjust that down. But, I'm totally lost in the relations counts. Although I showed a combined total of 112466 relations when 120304 were needed, I got these lines on the final run: Code: 414622 rels found: 50508 full + 364114 from 3307115 partial, (1569949.62 rels/sec) . . . 414622 rels found: 50508 full + 364114 from 3307115 partial, (1569940.81 rels/sec) 414622 rels found: 50508 full + 364114 from 3307115 partial, (1565675.04 rels/sec) now at relation 500001 now at relation 1000001 The balancing is definitely going to be complicated. I'm going to have to take into account every machine that I add, because I have such a variety. But, maybe once I do some calculations, it can be set in place with somewhat permanent values for a permanent set.
2022-03-23, 20:06   #11
bsquared

"Ben"
Feb 2007

5×727 Posts

Quote:
 Originally Posted by EdH But, I'm totally lost in the relations counts. Although I showed a combined total of 112466 relations when 120304 were needed, I got these lines on the final run: Code: 414622 rels found: 50508 full + 364114 from 3307115 partial, (1569949.62 rels/sec) . . . 414622 rels found: 50508 full + 364114 from 3307115 partial, (1569940.81 rels/sec) 414622 rels found: 50508 full + 364114 from 3307115 partial, (1565675.04 rels/sec) now at relation 500001 now at relation 1000001
First off, good work! A reduction in time is great for your mix of cpus.

To hopefully shed some light on relation counts: what we really need are N cycles (120304 of them, in your case). These are formed by combining full and partial relations. When yafu says something like
A rels found: B full + C from D partial
what it means is that it has A total cycles: B of them are from full relations (no large primes) and C of them are cycles generated from partial relations (some number of large primes). D is the count of raw partial relations, and it's D that you want to split among processors. My hint above from the tune.c file shows the number of D relations you need to gather, approximately, for a c100. It looks instead like you may be trying to split A among processors.

-siqsR specifies the number of D relations (raw relations) to gather before stopping.

The (rels/sec) figure that is reported is also raw (D) relations per sec. So if you know the approximate rels/s figure for each of your computers over a range of input sizes, then you should be able to compute how long to run a job by dividing the total D relations needed by the rels/s sum of participating computers.

Last fiddled with by bsquared on 2022-03-23 at 20:18

 Similar Threads Thread Thread Starter Forum Replies Last Post cgy606 Factoring 1 2016-10-21 13:37 dans Software 5 2010-01-03 04:30 Helfire Software 8 2004-01-14 00:09 dotnet_dev Software 17 2003-06-16 14:30 BillW Software 1 2003-01-21 20:11

All times are UTC. The time now is 17:26.

Thu Aug 11 17:26:23 UTC 2022 up 35 days, 12:13, 3 users, load averages: 1.58, 1.39, 1.31

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔