mersenneforum.org  

Old 2020-04-05, 11:50   #1
derDaDortIst
 
Apr 2020

2×3 Posts
mprime hangs system completely

Hello everyone!

I want to stress-test a newly built system and have run into a problem. When I run the stress test on my stress-test test system (an older notebook I use to comfortably try out my test routines before running them on the newly built system), it hangs the system completely:
  1. I boot an Ubuntu Live CD
  2. I connect it to the Internet and download the newest Linux x64 build (http://www.mersenne.org/ftp_root/gim...linux64.tar.gz)
  3. I unpack it (tar xzvf p95v298b6.linux64.tar.gz)
  4. I run it (./mprime)
  5. I choose "just stress testing" and 1 thread (blend)
--> The program starts the worker and the system immediately locks up completely. No input gets through, and the CD drive starts reading frantically. The only way I can stop it is by cutting the power.



What is happening? Is that some kind of crash? Is it a bug? Is it a hardware fault? Is it my fault??
Old 2020-04-05, 13:15   #2
paulunderwood
 
Sep 2002
Database er0rr

3,413 Posts

Please tell us more. What is the CPU and motherboard, and RAM type?

The most likely cause is heat. With the live Ubuntu DVD, install lm-sensors thusly:

sudo apt-get update
sudo apt-get install lm-sensors

To run it, enter sensors.

If the temperatures are outside what is tolerable, then the most likely cause is that you did not fit the CPU properly (which requires taking off the heatsink, cleaning the surfaces with appropriate chemicals, and refitting the heatsink with some new thermal compound spread over the CPU -- I recommend Arctic Silver 5).

But first run sensors both without mprime running and with it.

PS: you should be running ./mprime -m not just ./mprime

Last fiddled with by paulunderwood on 2020-04-05 at 13:51
Old 2020-04-05, 13:52   #3
phillipsjk
 
Nov 2019

2A₁₆ Posts

Does your older notebook have less than 4GB of RAM?

You may want to try limiting how much RAM is used for testing.

AFAIK, a lot of live CDs try to use compressed in-RAM swap space. I doubt the primality-testing data is very compressible.

The drive activity is most likely the system trying to reload program code from the disc because the disk cache got evicted from RAM.
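
As a rough way to pick such a limit, here is a minimal sketch. It is an assumption on my part, not something mprime provides: it reads MemAvailable from /proc/meminfo and suggests a fraction of it. The helper names and the 50 % default are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: suggest a memory budget for a stress test from MemAvailable.
# mem_available_kib and suggest_budget_mib are hypothetical helper names;
# the 50 % default is an arbitrary safety margin, not an mprime recommendation.

# Print MemAvailable in KiB from a meminfo-style file (default: /proc/meminfo).
mem_available_kib() {
    awk '$1 == "MemAvailable:" { print $2; exit }' "${1:-/proc/meminfo}"
}

# Print a budget in MiB: PERCENT of MemAvailable, rounded down.
# Returns non-zero if the file has no MemAvailable line.
suggest_budget_mib() {
    local percent="${1:-50}" file="${2:-/proc/meminfo}"
    local avail
    avail="$(mem_available_kib "$file")"
    [ -n "$avail" ] || return 1
    echo $(( avail * percent / 100 / 1024 ))
}
```

Calling `suggest_budget_mib 50` on the live system prints a number of MiB that could be used as a starting point for the stress test's memory setting. On a live CD the root filesystem itself sits in RAM, so erring on the low side seems sensible.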
Old 2020-04-05, 13:54   #4
S485122
 
Sep 2006
Brussels, Belgium

2³×197 Posts

I would add that before trying to reseat the heatsink (especially since, in this case, it would concern a notebook), I would check whether there is any accumulated dust in the machine to clean up.

Another cause of thermal problems in notebooks is the failure of the fan.

Jacob

Last fiddled with by S485122 on 2020-04-05 at 14:14 Reason: is
Old 2020-04-05, 14:21   #5
derDaDortIst
 
Apr 2020

6₈ Posts

Quote:
Originally Posted by paulunderwood
Please tell us more. What is the CPU and motherboard, and RAM type?
My test system is a Samsung E372 notebook. I do not have a LAN cable at the workbench where I am assembling the new system I want to stress, so I try out the stress tests on the notebook with its handy wireless LAN.

CPU: Intel Core i3-350M
RAM: 3GB of DDR3-1064 RAM

Quote:
The most likely cause is heat.
Really that instantaneously? It stops working immediately after I've started one single thread of mprime.

Quote:
With the live Ubuntu DVD install lm-sensors thusly

sudo apt-get install lm-sensors

To run it enter sensors.
I did this:

Code:
sudo apt-get install software-properties-common
sudo apt-add-repository universe
sudo apt-get update
sudo apt-get install lm-sensors

sensors gives the following output:

Code:
coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +34.0°C  (high = +80.0°C, crit = +90.0°C)
Core 2:       +33.0°C  (high = +80.0°C, crit = +90.0°C)

acpitz-virtual-0
Adapter: Virtual device
temp1:        +38.0°C  (crit = +89.0°C)
temp2:        +38.0°C  (crit = +89.0°C)

nouveau-pci-0200
Adapter: PCI adapter
GPU core:     +0.85 V  (min =  +0.80 V, max =  +1.03 V)
temp1:        +42.0°C  (high = +95.0°C, hyst =  +3.0°C)
                       (crit = +105.0°C, hyst =  +5.0°C)
                        (emerg = +135.0°C, hyst =  +5.0°C)
Quote:
If the temperatures are outside what is tolerable then the most likely cause is that you did not fit the CPU properly (which requires taking it off the heatsink and cleaning the surfaces with appropriate chemicals and refitting the heatsink with some new thermal compound spread over the CPU -- I recommend Arctic Silver 5),
I did not do anything with the CPU on that machine, I left it as it was delivered.

Quote:
But first run sensors without mprime and with it,
How can I run sensors "with mprime"? Once mprime is running, I cannot launch anything anymore. Even this, launched in a second terminal:

Code:
sudo renice -n -20 $$; \
sleep 30; \
echo Helloooooo; \
killall mprime
...never gets to the echoing part. Everything's frozen, just the CD drive goes crazy.
Old 2020-04-05, 14:25   #6
derDaDortIst
 
Apr 2020

110₂ Posts

Quote:
Originally Posted by phillipsjk
Does your older notebook have less than 4GB of RAM?

You may want to try limiting how much RAM is used for testing.

AFAIK, a lot of live CD's try to use compressed swap in-RAM space. I doubt the primality testing is very compressible.

The drive activity will be trying to load program code from disk because the disk cache got evicted from RAM.
I will give that a try! It sounds very reasonable.

Quote:
Originally Posted by S485122
I would add that before trying to reseat the heatsink (especially since, in this case, it would concern a notebook), I would check whether there is any accumulated dust in the machine to clean up.

Another cause of thermal problems in notebooks is the failure of the fan.

Jacob
If I run e.g. stress on the test machine and monitor it with s-tui, I can see the temperatures rise slowly and then a fan start working (I never run it longer than 20 to 30 seconds on the test machine, since I only want to establish the procedures for the real testing on the real device). That doesn't happen with mprime.

Last fiddled with by derDaDortIst on 2020-04-05 at 14:33 Reason: better English =)
Old 2020-04-05, 14:48   #7
paulunderwood
 
Sep 2002
Database er0rr

3,413 Posts

You could try The Ultimate Boot CD which has mprime on it.

If RAM-swap is a problem you could mkswap and swapon an attached USB pen device and presumably disable the RAM-swap.
Old 2020-04-05, 15:24   #8
chris2be8
 
Sep 2009

7·271 Posts

Run free -t to see how much RAM and swap space it has.

Chris
Old 2020-04-05, 16:08   #9
derDaDortIst
 
Apr 2020

6₈ Posts

Quote:
Originally Posted by paulunderwood
You could try The Ultimate Boot CD which has mprime on it.
I thought about using that, but I want to have a script running that monitors the temperatures constantly and kills the stressors if they go up too much. I could not find any way to do this with UBCD. Do you know of any? I do not want to leave it running overnight without that safety net.

Quote:
If RAM-swap is a problem you could mkswap and swapon an attached USB pen device and presumably disable the RAM-swap.
Or just restrict its memory usage, right? I am planning to run memtest86+ for a week as well.

Quote:
Originally Posted by chris2be8
Run free -t to see how much RAM and swap space it has.

Code:
ubuntu@ubuntu:~$ free -t
              total        used        free      shared  buff/cache   available
Mem:        3899200      786896      211048      587236     2901256     2255624
Swap:             0           0           0
Total:      3899200      786896      211048
ubuntu@ubuntu:~$ free -th
              total        used        free      shared  buff/cache   available
Mem:           3.7G        767M        206M        573M        2.8G        2.2G
Swap:            0B          0B          0B
Total:         3.7G        767M        206M


When I have Firefox open as well (from downloading mprime), it's 1.7 G.
So, when I restrict the memory usage to 1000 M, it starts, even with all threads, and I can stop it with Ctrl+C. That is great!!
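
As an extra guard on top of the program's own memory setting, something like the following might help. This is only a sketch under assumptions: run_capped is a made-up helper name, and I have not verified how mprime reacts when an allocation is refused. It uses the shell's ulimit -v to cap the virtual address space of the command, so an oversized allocation should fail inside the process instead of eating all the RAM of the live system.

```shell
#!/usr/bin/env bash
# Sketch: run a command with a cap on its virtual address space.
# run_capped is a hypothetical helper name. ulimit -v takes the limit in KiB
# and, because it is set inside a subshell, affects only the command started
# there and its children, not the calling shell.

run_capped() {
    local limit_mib="$1"
    shift
    (
        # Refuse to start the command if the limit cannot be applied.
        ulimit -v $(( limit_mib * 1024 )) || exit 1
        exec "$@"
    )
}
```

For example, `run_capped 1500 ./mprime -m` would start the menu with a cap of 1500 MiB. The cap applies to address space, not to resident memory, so it has to sit comfortably above the amount configured in the program itself.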
Old 2020-04-05, 16:16   #10
paulunderwood
 
Sep 2002
Database er0rr

3,413 Posts

Quote:
Originally Posted by derDaDortIst
I thought about using that, but I want to have a script running that monitors the temperatures constantly and kills the stressors if they go up too much. I could not find any way to do this with UBCD. Do you know of any? I do not want to leave it running overnight without that safety net.

Or just restrict its memory usage, right? I am planning to run a memtest86+ for a week as well.
I don't think UBCD will run mprime and thermal monitoring at the same time.
Old 2020-04-05, 17:02   #11
derDaDortIst
 
Apr 2020

2×3 Posts

Quote:
Originally Posted by paulunderwood
I don't think UBCD will run mprime and thermal monitoring at the same time.

I've tried it and you can, errmm, I can only switch to another TTY and run sensors there and write down the values. But I do not want to sit there, doing that for 24 hours. :D
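
A watchdog of the kind mentioned earlier in the thread could look roughly like this on a system with a normal shell. It is a sketch under assumptions, not a finished tool: the function names, the polling interval and the limit are made up, and it reads /sys/class/thermal/thermal_zone*/temp (millidegrees Celsius) rather than parsing the sensors output. Which zones exist, and whether they track the CPU cores, depends on the machine, so it is worth comparing the readings with sensors first.

```shell
#!/usr/bin/env bash
# Sketch: poll kernel temperature readings and run a command (for example
# "killall mprime") once any reading reaches a limit.
# max_temp_milli and watch_temps are hypothetical names for illustration.

# Print the highest reading (millidegrees C) found under the given directory.
# Returns non-zero if no readable numeric value exists.
max_temp_milli() {
    local dir="${1:-/sys/class/thermal}" max="" f t
    for f in "$dir"/thermal_zone*/temp; do
        [ -r "$f" ] || continue
        t="$(cat "$f" 2>/dev/null)" || continue
        # Skip empty or non-numeric values (this also skips negative readings).
        case "$t" in ''|*[!0-9]*) continue ;; esac
        if [ -z "$max" ] || [ "$t" -gt "$max" ]; then
            max="$t"
        fi
    done
    [ -n "$max" ] || return 1
    echo "$max"
}

# Usage: watch_temps LIMIT_MILLI INTERVAL_SECONDS DIR COMMAND [ARGS...]
# Polls until the limit is reached, then runs COMMAND and returns its status.
# If no reading is available, it keeps polling without ever triggering.
watch_temps() {
    local limit_milli="$1" interval="$2" dir="$3" now
    shift 3
    while :; do
        if now="$(max_temp_milli "$dir")" && [ "$now" -ge "$limit_milli" ]; then
            echo "temperature $now >= $limit_milli millidegrees C, running: $*" >&2
            "$@"
            return $?
        fi
        sleep "$interval"
    done
}
```

Started in the background before the stress test, e.g. `watch_temps 80000 5 /sys/class/thermal killall mprime &`, this would stop mprime once any zone reports 80 °C, the "high" value from the sensors output above. Passing the action as a command keeps the loop reusable for other stressors.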
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.