2021-11-18, 14:56  #276 
"Oliver"
Sep 2017
Porta Westfalica, DE
1111010011_{2} Posts 
Unfortunately, I have little experience in MPI, I used it in university, but there it was always set up for me and I only needed to use it.
My plan was to run the LA on my 5950X, which has 64 GB of RAM. If it would help, I could connect that computer to my 1950X via 10-gigabit LAN; I do not have any Infiniband hardware. The motherboard of that system is a bit flaky: I used to run it fully loaded (128 GB, 8 slots), but now I can only run it dual-channel (32 GB). In this state, it is at least stable again. So…
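(For reference, running MPI across two hosts like this needs Open MPI to be told about the second machine via a hostfile. A minimal sketch — the host names, slot counts, and grid shape below are hypothetical, and it assumes passwordless SSH between the boxes and the same msieve binary path on both:) Code:

```shell
# hostfile: one line per machine; "slots" = MPI processes to place there
# (hypothetical host names)
#   5950x  slots=2
#   1950x  slots=2

# then run the LA across both hosts over the LAN
mpirun --hostfile hostfile --bind-to none -np 4 ./msieve -nc2 2,2 -t 8
```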
Last fiddled with by kruoli on 2021-11-18 at 15:03 Reason: Word order. 
2021-11-18, 16:43  #277  
Apr 2020
2×3^{3}×13 Posts 
The 5950X has enough threads that it might be worth trying MPI even without the second machine. You can try mpirun --bind-to none -np 2 msieve -nc2 2,1 -t 16 to start out; you'll need --bind-to none when running with 2 processes, as otherwise MPI will bizarrely default to binding each process to a core! For Ed's dual Xeon, the solution ought to have been --bind-to socket, but for some reason this didn't work as it was supposed to. Experiment with different numbers of threads and MPI processes to see what works best. 
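(To see what the binding defaults actually do on a given machine, Open MPI can print a binding report. A quick sketch, using a harmless command in place of msieve so nothing heavy runs:) Code:

```shell
# --report-bindings prints each rank's core/socket binding to stderr
mpirun --report-bindings --bind-to none -np 2 hostname
mpirun --report-bindings --bind-to socket -np 2 hostname
```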

2021-11-18, 16:58  #278 
"Ed Hall"
Dec 2009
Adirondack Mtns
3^{3}·167 Posts 
Although I got my Infiniband working across two machines, I think that may have been after I did the MPI (openMPI) LA for a large composite.
In my case, I used it to gain some time across the two Xeon processors of my Z620 machine, and it did seem to cut some time off the LA. My machines mostly run Ubuntu 20.04 ATM, and openMPI is easily installed on that OS. (I did discover that Ubuntu 18.04's openMPI was broken and never fixed, as far as I could tell.) I just remembered that all my info is actually in some PMs. I will dig some of it out and post it in a little while... 
2021-11-18, 17:37  #279  
"Curtis"
Feb 2005
Riverside, CA
5279_{10} Posts 
I agree that target_density=120 is the minimum for this filtering job. My opinion (not based on enough data, I'm afraid) is that once we have enough relations to build a matrix at TD=124, we've likely gathered enough relations that time spent on sieving will be mostly wasted compared to the time saved on the matrix from those extra relations. I'd run filtering again with TD=100 to see if a matrix builds and how much it shrinks; 87M is big: 17,000 [thread] hours! Then I'd gather relations again and filter with TD=120 when we reach 1.25G raw relations. Last fiddled with by VBCurtis on 2021-11-18 at 17:39 
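(For anyone reproducing this: re-running only the filtering step at different densities is cheap compared to sieving. A sketch, assuming a stock msieve build with the relations and .dat file already in place in the working directory:) Code:

```shell
# -nc1 = filtering only; a lower target_density shows how much the matrix shrinks
./msieve -nc1 "target_density=100" -v

# later, once ~1.25G raw relations are gathered:
./msieve -nc1 "target_density=120" -v
```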

2021-11-18, 17:38  #280 
"Ed Hall"
Dec 2009
Adirondack Mtns
1000110011101_{2} Posts 
I see I missed charybdis' post. Sorry 'bout that! But I did find the PMs, and here are some details.
I just looked over the experimentation I did with openMPI for Msieve LA; the experiments were done with the Msieve benchmark files. Many of the openMPI tests actually added a great deal of time, but with charybdis' help I was able to find a command set that saved some time. This was all done on a Z620 dual-Xeon 6c/12t machine. The first set of tests showed the following: Code:
mpirun -np 2 msieve -nc2 2,1 -t 6     ETA 47 h 55 m
mpirun -np 2 msieve -nc2 2,1 -t 12    ETA 48 h 22 m
mpirun -np 4 msieve -nc2 2,2 -t 3     ETA 10 h 25 m
mpirun -np 4 msieve -nc2 2,2 -t 6     ETA 8 h 29 m
msieve -nc2 -t 12                     ETA 10 h 20 m
msieve -nc2 -t 24                     ETA 9 h 27 m
The command that finally saved time was: Code:
mpirun --bind-to none -np 2 ./msieve -nc2 2,1 -t 12    ETA 7 h 34 m
Last fiddled with by EdH on 2021-11-18 at 17:41 Reason: command correction 
2021-11-18, 18:56  #281  
Jul 2003
So Cal
2,371 Posts 
With cores divided into chiplets on the 5950X, MPI might help. It's not NUMA, but I would still try it. On Ubuntu, getting a working MPI installed is as simple as Code:
sudo apt install openmpi-bin openmpi-doc libopenmpi-dev
./msieve -nc2 -t 32 -v
mpirun -np 2 ./msieve -nc2 1,2 -t 16 -v
mpirun -np 4 ./msieve -nc2 2,2 -t 8 -v 

2021-11-18, 19:11  #282  
"Oliver"
Sep 2017
Porta Westfalica, DE
979_{10} Posts 
Thanks for all your input on MPI!
Edit: This is bogus as long as I'm sieving on the same machine. I will do this when sieving is done; that way it will not delay the sieving, where others are involved. Right now? It should be finished tomorrow, my time. Last fiddled with by kruoli on 2021-11-18 at 19:31 Reason: Additions. 

2021-11-18, 19:30  #283  
Apr 2020
2×3^{3}×13 Posts 
Can't remember whether 2,1 vs 1,2 makes much of a difference in timings. IIRC you will need to rebuild the matrix if you change the first parameter, so at least if you test 2,1 first you can then test 2,2 with -nc2 "2,2 skip_matbuild=1" and avoid having to build the matrix again. Relative speeds should be reasonably consistent between the current oversized matrix and the final one, though frmky has far more experience with this than I do. Last fiddled with by charybdis on 2021-11-18 at 19:31 

2021-11-18, 19:52  #284 
"Curtis"
Feb 2005
Riverside, CA
5,279 Posts 
Naw, I forgot how fast we're gathering relations. Default vs TD=100 is a mildly interesting data point for matrix size, but we won't be using either of those matrices, so it's not important.
I think doing a filtering run somewhere around 1.23-1.28G raw relations will give us an indication of when to shut down sieving. Sieving is going quickly and the uniques ratio is good, so I doubt more than 1.33G raw relations is needed. We agree that testing MPI is not useful while still sieving on the same machine! 
2021-11-18, 22:22  #285  
Jul 2003
So Cal
2,371 Posts 
You don't need to rebuild the matrix to change the first parameter. Once you build the matrix with MPI, you can use that matrix to test different parameters, both with and without MPI, using skip_matbuild=1. So, for example, run the tests in this sequence: Code:
mpirun -np 2 --bind-to none ./msieve_mpi -nc2 1,2 -t 16 -v
mpirun -np 2 --bind-to none ./msieve_mpi -nc2 "2,1 skip_matbuild=1" -t 16 -v
mpirun -np 4 --bind-to none ./msieve_mpi -nc2 "2,2 skip_matbuild=1" -t 8 -v
./msieve_nompi -nc2 skip_matbuild=1 -t 32 -v
And yes, relative speeds should be consistent across a wide range of matrix sizes. 

2021-11-18, 23:39  #286  
Apr 2020
2·3^{3}·13 Posts 

