mersenneforum.org  

2004-02-08, 23:38   #1
bej
Jan 2004
3·5 Posts

mprime segmentation fault on RHEL

Is anyone running mprime on Red Hat Enterprise Linux 3? On a newly installed system, I get a segmentation fault after mprime runs for a while. Initially it appeared to crash after Stage 1 GCD completed, so I set Stage1GCD=0 in prime.ini, but now it crashes at the end of stage 1 of P-1 factoring.

I don't know much about debugging under Linux, but I tried downloading the prime95 source and building mprime with debugging enabled. That build seg faulted much sooner, so it was no help.

I've also tried the statically linked mprime, and it fails too. I also installed Windows XP on the machine temporarily, and prime95 runs OK there, so I don't think it's a hardware problem.

Any suggestions on how to debug/resolve this problem?

Thanks,
Brian
2004-02-10, 22:43   #2
geoff
Mar 2003
New Zealand
485₁₆ Posts

I am just guessing here, but could memory allocation be the problem?
Before starting stage 2 P-1 a big chunk of memory has to be allocated.

What are your memory settings (DayMemory and NightMemory lines in
local.ini), and how much total (virtual) memory is available on your
machine?
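
A minimal sketch of how those two settings could be pulled out of local.ini for a quick check, assuming the file is plain KEY=VALUE lines with no section headers and that the values are in megabytes. The function names and the sample values are hypothetical, not part of mprime.

```python
def read_ini_settings(text):
    """Parse KEY=VALUE lines from a section-less ini file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and anything that is not a KEY=VALUE pair.
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def memory_settings_mb(text):
    """Return (DayMemory, NightMemory) as ints, or None for a missing key."""
    settings = read_ini_settings(text)
    day = settings.get("DayMemory")
    night = settings.get("NightMemory")
    return (int(day) if day is not None else None,
            int(night) if night is not None else None)


# Hypothetical file contents, for illustration only.
sample = "CPUHours=24\nDayMemory=64\nNightMemory=128\n"
print(memory_settings_mb(sample))
```

Comparing these numbers against the machine's free RAM plus swap gives a rough idea of whether the stage 2 allocation could plausibly fail.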

When you ran Prime95 on Windows, was it the same version as the
mprime version you ran on Linux? Was there any swapping activity
when stage 2 P-1 started on Windows?

What made me think of this is that I recently upgraded from Linux kernel
2.4 to 2.6 and found that my version 0 swap partition couldn't be used;
I had to reformat it as version 1 swap. Does RHEL 3 use kernel 2.6?
2004-02-11, 00:55   #3
Xyzzy
"Mike"
Aug 2002
2²·13·157 Posts

It uses 2.4... I've tested mprime and it works fine in RHAS 2 and 3...
2004-02-11, 05:52   #4
bej
Jan 2004
1111₂ Posts

It doesn't seem to be related to memory allocation (or over-allocation). I had the memory limits at 32M/32M day/night, and my system has 512M of real memory and a 1G swap partition. I don't notice any swapping before it crashes; right up to the crash, my VM usage stays at only 76-77M.

I probably didn't run the exact same version of Prime95 on Windows on this machine -- I have version 23.5 of mprime and probably ran either 23.4 or 23.7 of prime95.

And as Xyzzy states, RHEL ES 3 uses kernel 2.4 (2.4.21). It's good to hear that someone has run mprime OK on RHEL3... though that doesn't help me with my problem.

My other Linux system that is running OK (RH8) uses kernel 2.4.20. It's still running mprime 23.4 -- I'll have to give that a try also.

Any other suggestions? Is there anything I might be able to get out of the core file with the downloaded versions of mprime?

Thanks.
2004-02-12, 01:40   #5
geoff
Mar 2003
New Zealand
13×89 Posts

The only other thing I can think of is to check exactly what
the resource limits are for the user running mprime, e.g. check
that virtual memory has not been limited with 'ulimit -v' etc.
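
The same check can be sketched from Python, assuming Python 3 on Linux. The standard `resource` module reports the address-space limit (RLIMIT_AS) that 'ulimit -v' controls; note that the shell prints the value in kilobytes while `getrlimit` returns bytes. The helper names are hypothetical.

```python
import resource


def describe_limit(value):
    """Render a raw rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def virtual_memory_limit():
    """Return the (soft, hard) address-space limits as printable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return describe_limit(soft), describe_limit(hard)


if __name__ == "__main__":
    soft, hard = virtual_memory_limit()
    # Values are in bytes unless reported as 'unlimited'.
    print("RLIMIT_AS soft:", soft)
    print("RLIMIT_AS hard:", hard)
```

Run it as the same user that runs mprime, since limits are per user and per session.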

You can run gdb on mprime with 'gdb mprime core' but it
probably won't be much help without the debugging symbols,
unless you are good at following assembly. If it segfaulted in
a library function then it will at least tell you which one.
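
For a non-interactive backtrace, a small wrapper along these lines might help. It is only a sketch: it assumes a gdb recent enough to accept `--batch` and `-ex`, that gdb is on the PATH, and that the binary and core file are named mprime and core in the current directory. The function names are hypothetical.

```python
import subprocess


def gdb_backtrace_command(binary, core):
    """Build a gdb command line that prints a backtrace and exits."""
    return ["gdb", "--batch", "-ex", "bt", binary, core]


def run_backtrace(binary="mprime", core="core"):
    """Run gdb on the core file and return whatever it printed."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        capture_output=True,
        text=True,
    )
    return result.stdout


# Show the command that would be executed, without invoking gdb.
print(" ".join(gdb_backtrace_command("mprime", "core")))
```

Even without debugging symbols, the top frame of the output should name the library function if the fault happened inside one.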

To build mprime with debugging information I think you will
need the full installation of binutils with all the cross-compiling
utilities, not usually installed by default. I haven't done this
myself, maybe someone else has?
2004-02-12, 06:20   #6
bej
Jan 2004
3·5 Posts

ulimit -v is unlimited, so that shouldn't be a problem. gdb shows mprime died in free().

I tried building mprime again. If I use the .o files that ship in sources23.zip, it seg faults immediately when I start running (the menu is all OK if I run mprime -M; the seg fault comes immediately after I select Test/Continue). I downloaded and rebuilt the latest binutils with COFF support and rebuilt mprime (after a make clean, so that it builds from the included .obj files rather than the .o files), but it still faults immediately in the same way.

Anyone know how mprime should be built? Or any other suggestions on tracking down the original seg fault?

Thanks.
2004-02-12, 19:41   #7
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
1D43₁₆ Posts

Quote:
Originally Posted by bej
Anyone know how mprime should be built? Or any other suggestions on tracking down the original seg fault?
Thanks.

Make sure the data segment for mult.o is on a 32-byte boundary. Use the right dummyXX.o file to move the data segment around.
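
The arithmetic behind that check can be sketched as follows. This only illustrates what "on a 32-byte boundary" means; how the address is obtained (for example from a linker map) and how much each dummyXX.o file shifts the segment are not covered here. The function names and the example address are hypothetical.

```python
ALIGNMENT = 32  # required boundary, in bytes


def misalignment(address, alignment=ALIGNMENT):
    """Bytes by which the address sits past the previous boundary."""
    return address % alignment


def padding_needed(address, alignment=ALIGNMENT):
    """Bytes to add so the address lands on the next boundary (0 if aligned)."""
    return (-address) % alignment


# Hypothetical data segment start address, for illustration only.
example = 0x0809A010
print(hex(example), "is off by", misalignment(example), "bytes")
print("padding needed:", padding_needed(example), "bytes")
```

A padding of 0 means the segment is already aligned; any other value is the amount the segment has to move.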
2004-02-13, 05:09   #8
bej
Jan 2004
3×5 Posts

Thanks. I overlooked the alignment comments in the makefile. I've now rebuilt mprime with debug information, and it's running. In fact, it has now just completed P-1 stage 1 and stage 1 GCD, and has started P-1 stage 2. It's never gotten this far before on this system.

I guess that's good, but now I'm running a version of mprime minus the security module. Will this prevent me from reporting results with primenet or otherwise affect the end results?

Any suggestions on where I should go from here?

Thanks.
2004-02-14, 01:08   #9
geoff
Mar 2003
New Zealand
2205₈ Posts

Could you post a minimal local.ini, prime.ini, and worktodo.ini that
will trigger the segfault on your system? It would be good to see
if someone else can reproduce it. I am running Debian, but I'll try
it out using kernel 2.4.21.

Just to clarify your first post, does sprime segfault at the same
place as mprime? And have you been able to complete a long run
torture test on this machine?
2004-02-14, 08:32   #10
bej
Jan 2004
1111₂ Posts

The debug version of mprime I built had gotten well into LL testing with no errors or faults, so I stopped it and rebuilt mprime with optimization turned back on and debug turned off. It seg faulted at seemingly the same place (at the end of stage 1 GCD) as mprime 23.4/23.5. And yes, sprime seems to fault at the same place, but I never looked at a core from it.

The odd thing, though, is that the core from the mprime I built myself looks completely different from the previous cores. This one shows the fault in gwcopy() with the following backtrace:

Code:
#0  0x0808aae0 in gwcopy ()
#1  0x08265448 in ?? ()
#2  0x0806d43d in pminus1 ()
#3  0x0805e67f in pfactor ()
#4  0x08058f6f in primeContinue ()
#5  0x08071876 in linuxContinue ()
#6  0x080737ea in main_menu ()
#7  0x08071197 in main ()

I have run a long torture test OK (12+ hours) on this system. Here are my ini files (minus personal info).
Code:
* prime.ini
AskedAboutMemory=1
UsePrimenet=1
DialUp=0
DaysOfWork=1
WorkPreference=0
OutputIterations=100
ResultsFileIterations=999999999
DiskWriteTime=30
NetworkRetryTime=2
NetworkRetryTime2=240
DaysBetweenCheckins=3
TwoBackupFiles=1
SilentVictory=0

* local.ini
OldCpuType=12
OldCpuSpeed=2659
ComputerID=hermes
CPUHours=24
DayMemory=32
NightMemory=32
DayStartTime=450
DayEndTime=1410
Pid=488
LastEndDatesSent=1076563108
RollingStartTime=0
SelfTest768Passed=1
RollingAverage=999
SelfTest1024Passed=1
SelfTest8Passed=1
SelfTest10Passed=1
SelfTest896Passed=1
SelfTest12Passed=1
SelfTest14Passed=1

* worktodo.ini
Test=14010833,65,0

I've had memory at both 32/32 and 128/128 -- same fault. It always seems to fail at the beginning of P-1 stage 2. Is there a way to bypass P-1 factoring completely? I thought SkipTrialFactoring=1 sounded like it would do it, but it didn't seem to have any effect.

Thanks.
2004-02-14, 17:07   #11
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
3·11·227 Posts

To skip P-1, edit your worktodo.ini and change the trailing ",0" to ",1".
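
That edit can be sketched as a small helper, assuming the worktodo.ini entry has the three-field form Test=exponent,bits,flag seen earlier in the thread and that the last field is the one to flip. The function name is hypothetical, and lines in any other form are returned unchanged.

```python
def mark_pminus1_done(line):
    """Flip the trailing ',0' of a three-field Test= line to ',1'."""
    prefix = "Test="
    if not line.startswith(prefix):
        return line  # leave other kinds of work entries alone
    fields = line[len(prefix):].split(",")
    # Only touch the expected three-field form whose last field is 0.
    if len(fields) == 3 and fields[2].strip() == "0":
        fields[2] = "1"
    return prefix + ",".join(fields)


print(mark_pminus1_done("Test=14010833,65,0"))
```

Stop mprime before changing worktodo.ini by hand so the file is not rewritten underneath the edit.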
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.