#1
"Ed Hall"
Dec 2009
Adirondack Mtns
124538 Posts
I have been running ecmpi across a cluster for quite some time, but I've hit a problem trying to add more machines. I'm currently running 16 slaves, which I briefly thought might be a limit.
What I have found, though, is that every machine I'm trying to add runs Ubuntu 18.04, while the currently running cluster consists entirely of 16.04 machines. One of the 18.04 machines was a working part of the cluster quite some time ago, before its 18.04 upgrade. All of the 18.04 machines have had the ecmpi program freshly rebuilt. Every machine can communicate freely with all the others via ssh. All the machines have the same username and directory structure, with the working directory on the host and sshfs-mounted on all the others (although some of my testing suggests the mount may not be necessary for my setup). Two of the 18.04 machines do run together as their own cluster, which is why I suspect the version difference is the issue. I hesitate to upgrade any more machines until I solve this. Any thoughts from those who are familiar with openmpi/ecmpi? Thanks...
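One way to test the version suspicion would be to ask every node which Open MPI it actually has installed. The following is only a rough sketch: the host names are hypothetical, it assumes passwordless ssh works as described above, and it relies on `mpirun --version` printing a line of the form `mpirun (Open MPI) X.Y.Z`.

```python
import re
import subprocess
from collections import defaultdict

# Hypothetical host names; replace with the entries from your own hostfile.
HOSTS = ["node01", "node02", "node03"]


def parse_version(output):
    """Extract 'X.Y.Z' from text such as 'mpirun (Open MPI) 2.1.1'."""
    match = re.search(r"\(Open MPI\)\s+(\d+(?:\.\d+)+)", output)
    return match.group(1) if match else None


def remote_version(host, timeout=15):
    """Run 'mpirun --version' on host over ssh; return None on any failure."""
    try:
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", host, "mpirun", "--version"],
            capture_output=True, text=True, timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_version(result.stdout + result.stderr)


def group_by_version(versions):
    """Map each reported version (or None) to the list of hosts reporting it."""
    groups = defaultdict(list)
    for host, version in versions.items():
        groups[version].append(host)
    return dict(groups)


def main():
    """Query every host and print one line per distinct version."""
    versions = {host: remote_version(host) for host in HOSTS}
    for version, hosts in group_by_version(versions).items():
        print(version or "unknown", "->", ", ".join(hosts))
```

Calling `main()` from the host machine prints one line per distinct version. More than one line means the nodes are not all running the same Open MPI release, which would be worth ruling out before looking elsewhere.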
#2
"Ed Hall"
Dec 2009
Adirondack Mtns
5,419 Posts
I know this thread is ancient, but since I never posted the solution, if it can be called that, I thought I should add the following for "closure."
The trouble turned out to be openmpi rather than ecmpi. The repository version of openmpi on the 18.04 machines wouldn't work with a --hostfile, which made it useless for a cluster of machines. I could never get the openmpi source to compile properly, so I abandoned the use of 18.04 machines in my cluster. In my case, ecmpi was actually quite inefficient anyway: my machines vary greatly in ability, and the ecmpi results were not evaluated until all nodes had returned, so the fast machines sat idle waiting for the slow ones. I stopped using ecmpi and moved to local scripts that run ecm.py.
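To put rough numbers on the inefficiency described above, here is a toy model, not a measurement. It assumes each node runs one curve per batch and the batch ends only when the slowest node finishes. The per-curve timings are made up for illustration.

```python
def curves_with_barrier(seconds_per_curve, budget):
    """Curves finished when every batch waits for the slowest node.

    Each batch takes as long as the slowest node needs for one curve,
    and yields exactly one curve per node.
    """
    batch_time = max(seconds_per_curve)
    batches = int(budget // batch_time)
    return batches * len(seconds_per_curve)


def curves_independent(seconds_per_curve, budget):
    """Curves finished when every node works through curves on its own."""
    return sum(int(budget // t) for t in seconds_per_curve)


# Hypothetical timings: a fast, a medium and a slow machine (seconds per curve).
timings = [10, 20, 60]
budget = 600  # seconds of wall-clock time

print("barrier:    ", curves_with_barrier(timings, budget))   # 30
print("independent:", curves_independent(timings, budget))    # 100
```

In this made-up case the independent scripts finish more than three times as many curves in the same wall-clock time. The gap grows as the machines become more unequal and disappears when they are all the same speed.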