mersenneforum.org

Prime Monster 2002-08-28 18:09

Linux mprime client v22.8 problem
 
George,

I am running the above client on several nodes, including some servers (IBM DB2 7.3 latest patch level, Apache, Tomcat and others). On the DB2 server I had to remove the client because the whole machine froze up. On the Tomcat machine it slows the response time by something like 20 milliseconds (=20%), which is within acceptable limits. On the Apache machine there are no noticeable effects. I tried running only one client, with the same results. I also tried running factorization only, and again got the same results.

All the machines are running Red Hat 7.3 with the latest Red Hat patch levels (including libraries). These machines have a large amount of RAM, from 1.5GB up to 4GB, and 2 CPUs each. The DB2 machine is a dual PIII Xeon with 1MB cache; the other 2 servers are dual 1GHz PIIIs.

I believe I had the same problem with earlier versions of the client, but at that time the machines were not in production, so a freeze didn't bother me too much. Now they are in production and I cannot have machines freezing at any time.

Is there any way I can figure out what goes on with the client (a debug log or something)?

Prime95 2002-08-28 18:43

You're much more a Linux expert than I am. I don't know why a nice'd client would cause these problems.

I can send you an mprime with symbols in it or you can build an mprime from scratch using the source at http://www.mersenne.org/source.htm
I presume either of these might help you debug mprime.

Out of curiosity, if you set UsePrimenet=0 in prime.ini do you get the same troubles? That would tell us if we have a communication problem or an LL problem.

Prime Monster 2002-08-28 18:58

I am fairly certain it is not a communication problem, as the freezing of the machine happens fairly "slowly and gradually". The first to go is keyboard and mouse, then screen, and then after a while the communication system. If I have a remote window open on the machine and spot it happening I can save the situation by killing the mprime client and restarting it.

It typically takes up to a week for the machine to freeze.

Alf

Prime95 2002-08-28 22:01

It sounds like mprime is leaking resources (memory, file handles, sockets, etc). Do you know of any Linux tools that might clue us into the problem?

Prime Monster 2002-08-28 22:51

I'll check it out. I know there are some, but I need to see whether they'll work in my setup.

Alf

Prime Monster 2002-08-29 09:55

George,

[quote]It sounds like mprime is leaking resources (memory, file handles, sockets, etc). Do you know of any Linux tools that might clue us into the problem?[/quote]

There are several tools for Linux, as one would expect. Some of them seem to be highly specialized and some are more generic.

I would think the best approach is to try them one at a time, i.e. one run for memory leaks, one for file handles, one for network handles, and so on. That would also make tracing over such a long period easier.
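
Before reaching for a specialized tool, a slow leak of this kind can often be spotted by logging a process's resident memory and open descriptor count at intervals. The following is a minimal sketch, not part of mprime: it assumes a Linux /proc filesystem that exposes a VmRSS line in /proc/<pid>/status and one entry per open descriptor (files and sockets alike) in /proc/<pid>/fd. The function names, the log file name and the five-minute interval are all hypothetical choices.

```python
import os
import time


def parse_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     12345 kB".
            return int(line.split()[1])
    return None


def sample(pid):
    """Take one snapshot of resident memory and open descriptors for pid."""
    with open(f"/proc/{pid}/status") as f:
        rss_kb = parse_rss_kb(f.read())
    # Each entry in /proc/<pid>/fd is one open descriptor (file or socket).
    fd_count = len(os.listdir(f"/proc/{pid}/fd"))
    return rss_kb, fd_count


def format_row(timestamp, rss_kb, fd_count):
    """Format one CSV row: UTC time, RSS in kB, open descriptor count."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(timestamp))
    return f"{stamp},{rss_kb},{fd_count}"


def monitor(pid, log_path, interval_s=300, samples=None):
    """Append one row per interval; run until killed if samples is None."""
    taken = 0
    with open(log_path, "a") as log:
        while samples is None or taken < samples:
            rss_kb, fd_count = sample(pid)
            log.write(format_row(time.time(), rss_kb, fd_count) + "\n")
            log.flush()  # keep the log usable even if the machine freezes
            taken += 1
            if samples is None or taken < samples:
                time.sleep(interval_s)
```

Calling monitor(12345, "mprime_usage.csv"), where 12345 stands in for the real mprime process ID, would leave a CSV file behind. A steadily rising memory or descriptor column over several days would point at the kind of leak suspected above. Reading /proc/<pid>/fd requires running as the same user as the process, or as root.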

I'll run the traces on one of the systems if you send me an mprime with symbols in it.

Alf

daWabbit 2002-08-29 11:14

I believe the tool you want to try is called "Valgrind" and is available from SourceForge or Freshmeat.
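
As a rough illustration of how such a run might be set up, the sketch below builds a Valgrind command line and launches it. The options shown (--leak-check=full, --track-fds=yes, --log-file) are taken from current Valgrind releases and may differ in older versions. The ./mprime path, the log file name and the helper names are assumptions. Note that a program runs considerably slower under Valgrind.

```python
import subprocess


def valgrind_command(program, program_args=(), log_file="valgrind-mprime.log"):
    """Build an argv list that runs program under Valgrind's leak checker."""
    return [
        "valgrind",
        "--leak-check=full",      # report each leaked block with a stack trace
        "--track-fds=yes",        # list descriptors still open at exit
        f"--log-file={log_file}", # write the report to a file, not stderr
        program,
        *program_args,
    ]


def run_under_valgrind(program, program_args=()):
    """Run the program under Valgrind and return its exit status."""
    return subprocess.run(valgrind_command(program, program_args)).returncode
```

For example, run_under_valgrind("./mprime") would start the client from the current directory. A build with symbols, as offered earlier in the thread, should make the stack traces in the report readable.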


All times are UTC.