#1 |
Aug 2002
2²·5·13 Posts |
George,
I am running the above client on several nodes, including some servers (IBM DB2 7.3 at the latest patch level, Apache, Tomcat and others). On the DB2 server I had to remove the client because the whole machine froze up. On the Tomcat machine it slows the response time by something like 20 milliseconds (about 20%), which is within acceptable limits. On the Apache machine there are no noticeable effects. I tried running only one client, with the same results. I also tried running factorization only, again with the same results.
All the machines run Red Hat 7.3 with the latest Red Hat patch levels (including libraries). They have a large amount of RAM, 1.5GB to 4GB, and 2 CPUs each. The DB2 machine is a dual PIII Xeon with 1MB cache; the other 2 servers are dual 1GHz PIIIs.
I believe I had the same problem with earlier versions of the client, but then the machines were not in production, so a freeze didn't bother me too much. Now they are in production and I cannot have machines freezing at any time. Is there any way I can figure out what is going on with the client (a debug log or something)? |
#2 |
P90 years forever!
Aug 2002
Yeehaw, FL
8158₁₀ Posts |
You're much more of a Linux expert than I am. I don't know why a nice'd client would cause these problems.
I can send you an mprime with symbols in it, or you can build an mprime from scratch using the source at http://www.mersenne.org/source.htm. I presume either of these might help you debug mprime. Out of curiosity, if you set UsePrimenet=0 in prime.ini, do you get the same troubles? That would tell us whether we have a communication problem or an LL problem. |
#3 |
Aug 2002
2²×5×13 Posts |
I am fairly certain it is not a communication problem, as the machine freezes fairly "slowly and gradually". The first to go are the keyboard and mouse, then the screen, and then after a while the communication system. If I have a remote window open on the machine and spot it happening, I can save the situation by killing the mprime client and restarting it.
On average it takes up to a week for the machine to freeze.
Alf |
#4 |
P90 years forever!
Aug 2002
Yeehaw, FL
2×4,079 Posts |
It sounds like mprime is leaking resources (memory, file handles, sockets, etc.). Do you know of any Linux tools that might clue us in to the problem?
|
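One hedged way to look for the kind of leak suggested above, without special tooling, is to snapshot the process through the Linux `/proc` filesystem. The sketch below assumes Python 3 and a `/proc/<pid>/status` file with a `VmRSS` line; the function names `parse_vm_rss_kb` and `sample_process` are made up for illustration and are not part of mprime.

```python
import os


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     12345 kB".
            return int(line.split()[1])
    return None


def sample_process(pid):
    """Take one snapshot of resident memory and open descriptors for pid.

    Assumes a Linux /proc filesystem and permission to read the
    target process's fd directory.
    """
    with open(f"/proc/{pid}/status") as fh:
        rss_kb = parse_vm_rss_kb(fh.read())

    fds = os.listdir(f"/proc/{pid}/fd")
    sockets = 0
    for fd in fds:
        try:
            # Socket descriptors are symlinks of the form "socket:[inode]".
            if os.readlink(f"/proc/{pid}/fd/{fd}").startswith("socket:"):
                sockets += 1
        except OSError:
            # The descriptor was closed between listdir and readlink.
            pass

    return {"rss_kb": rss_kb, "open_fds": len(fds), "sockets": sockets}
```

Calling `sample_process` with mprime's PID every few minutes and appending the result to a file would give a trace that shows which of the three numbers, if any, keeps climbing before a freeze.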
#5 |
Aug 2002
2²×5×13 Posts |
I'll check it out. I know there are some, but I need to see if they'll work in my setup.
Alf |
#6 | |
Aug 2002
404₈ Posts |
George,
Quote:
I would think the best thing would be to try them one by one, i.e. memory leaks, then file handles, then network handles, and so on. That would also make tracing over such a long period easier. I'll run the traces on one of the systems if you send me an mprime with symbols in it.
Alf |
|
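When one resource is traced at a time over a week, a simple way to read the trace is to fit a straight line to the samples and look at the slope. The sketch below is only an illustration of that idea; the function name `growth_per_hour` and the `(seconds, value)` sample format are assumptions, not something any existing tool produces.

```python
def growth_per_hour(samples):
    """Least-squares slope of value against time, in units per hour.

    samples is a list of (seconds_since_start, value) pairs, for example
    resident memory in kB or the number of open file handles.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n

    # Ordinary least squares: slope = cov(t, v) / var(t).
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")

    # Convert from units per second to units per hour.
    return num / den * 3600.0
```

A slope that stays clearly positive across several days for one resource, while the others stay flat, would point at that resource as the one leaking.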
#7 |
Aug 2002
23 Posts |
I believe the tool you want to try is called "Valgrind"; it is available from SourceForge or Freshmeat.
|