mersenneforum.org  

Old 2020-07-29, 02:52   #1
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2×3,581 Posts
Linux crash (very strange)

Dream machine has decided to spontaneously reboot every 19 minutes!?
Yes, 19 minutes like clockwork.

It is not overheating (nor are any of the 4 CPUs that are doing a network boot).
dmesg shows nothing strange right up until the crash.

I am at a loss as to what to do next. Ideas?
Old 2020-07-29, 02:58   #2
paulunderwood
 
Sep 2002
Database er0rr

110101010101₂ Posts

A freshly installed Debian has to be coaxed, with a mask command, into not suspending etc. after ~20 minutes. I wonder if your problem is something similar.

Last fiddled with by paulunderwood on 2020-07-29 at 03:16
Old 2020-07-29, 03:19   #3
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

1101111111010₂ Posts

Ubuntu 16.04 -- hasn't changed since the day it was built ~5 years ago.
Old 2020-07-29, 03:31   #4
paulunderwood
 
Sep 2002
Database er0rr

D55₁₆ Posts

Quote:
Originally Posted by Prime95
Ubuntu 16.04 -- hasn't changed since the day it was built ~5 years ago.

This says run the command egrep -ir "(shut|reboot)" /var/log/*
Old 2020-07-29, 03:58   #5
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2×3,581 Posts

Since 9:00 tonight

Code:
/var/log/syslog:Jul 28 21:27:17 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...
/var/log/syslog:Jul 28 21:27:17 z170itx systemd[1]: Started Update UTMP about System Boot/Shutdown.
/var/log/syslog:Jul 28 21:27:17 z170itx cron[907]: (CRON) INFO (Running @reboot jobs)
/var/log/syslog:Jul 28 21:27:17 z170itx systemd[1]: Starting LXD - container startup/shutdown...
/var/log/syslog:Jul 28 21:27:18 z170itx systemd[1]: Started LXD - container startup/shutdown.
/var/log/syslog:Jul 28 21:27:23 z170itx systemd[1]: Started Unattended Upgrades Shutdown.
/var/log/syslog:Jul 28 21:47:13 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...
/var/log/syslog:Jul 28 21:47:13 z170itx systemd[1]: Started Update UTMP about System Boot/Shutdown.
/var/log/syslog:Jul 28 21:47:13 z170itx systemd[1]: Starting LXD - container startup/shutdown...
/var/log/syslog:Jul 28 21:47:13 z170itx cron[941]: (CRON) INFO (Running @reboot jobs)
/var/log/syslog:Jul 28 21:47:13 z170itx systemd[1]: Started LXD - container startup/shutdown.
/var/log/syslog:Jul 28 21:47:17 z170itx systemd[1]: Started Unattended Upgrades Shutdown.
/var/log/syslog:Jul 28 22:07:12 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...
/var/log/syslog:Jul 28 22:07:12 z170itx systemd[1]: Started Update UTMP about System Boot/Shutdown.
/var/log/syslog:Jul 28 22:07:12 z170itx systemd[1]: Starting LXD - container startup/shutdown...
/var/log/syslog:Jul 28 22:07:12 z170itx cron[927]: (CRON) INFO (Running @reboot jobs)
/var/log/syslog:Jul 28 22:07:13 z170itx systemd[1]: Started LXD - container startup/shutdown.
/var/log/syslog:Jul 28 22:07:18 z170itx systemd[1]: Started Unattended Upgrades Shutdown.
/var/log/syslog:Jul 28 22:24:37 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...
/var/log/syslog:Jul 28 22:24:37 z170itx systemd[1]: Started Update UTMP about System Boot/Shutdown.
/var/log/syslog:Jul 28 22:24:37 z170itx systemd[1]: Starting LXD - container startup/shutdown...
/var/log/syslog:Jul 28 22:24:37 z170itx cron[970]: (CRON) INFO (Running @reboot jobs)
/var/log/syslog:Jul 28 22:24:38 z170itx systemd[1]: Started LXD - container startup/shutdown.
/var/log/syslog:Jul 28 22:24:43 z170itx systemd[1]: Started Unattended Upgrades Shutdown.
/var/log/syslog:Jul 28 22:40:11 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...
/var/log/syslog:Jul 28 22:40:11 z170itx systemd[1]: Started Update UTMP about System Boot/Shutdown.
/var/log/syslog:Jul 28 22:40:11 z170itx systemd[1]: Starting LXD - container startup/shutdown...
/var/log/syslog:Jul 28 22:40:11 z170itx cron[926]: (CRON) INFO (Running @reboot jobs)
/var/log/syslog:Jul 28 22:40:12 z170itx systemd[1]: Started LXD - container startup/shutdown.
/var/log/syslog:Jul 28 22:40:17 z170itx systemd[1]: Started Unattended Upgrades Shutdown.
/var/log/syslog:Jul 28 23:04:53 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...
/var/log/syslog:Jul 28 23:04:53 z170itx systemd[1]: Started Update UTMP about System Boot/Shutdown.
/var/log/syslog:Jul 28 23:04:53 z170itx cron[949]: (CRON) INFO (Running @reboot jobs)
/var/log/syslog:Jul 28 23:04:53 z170itx systemd[1]: Starting LXD - container startup/shutdown...
/var/log/syslog:Jul 28 23:04:53 z170itx systemd[1]: Started LXD - container startup/shutdown.
/var/log/syslog:Jul 28 23:04:58 z170itx systemd[1]: Started Unattended Upgrades Shutdown.
/var/log/syslog:Jul 28 23:42:23 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...
/var/log/syslog:Jul 28 23:42:23 z170itx systemd[1]: Started Update UTMP about System Boot/Shutdown.
/var/log/syslog:Jul 28 23:42:23 z170itx cron[852]: (CRON) INFO (Running @reboot jobs)
/var/log/syslog:Jul 28 23:42:23 z170itx systemd[1]: Starting LXD - container startup/shutdown...
/var/log/syslog:Jul 28 23:42:24 z170itx systemd[1]: Started LXD - container startup/shutdown.
/var/log/syslog:Jul 28 23:42:29 z170itx systemd[1]: Started Unattended Upgrades Shutdown.
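
A rough way to check the "like clockwork" claim against the grep output above is to diff the boot timestamps. The sketch below is not from the thread. It assumes each boot logs exactly one "Starting Update UTMP about System Boot/Shutdown" line, takes the year 2020 from the post dates (syslog omits it), and uses a made-up helper name, `boot_intervals`.

```python
import re
from datetime import datetime

# Marker assumed to appear once per boot (taken from the grep output above).
BOOT_MARKER = "Starting Update UTMP about System Boot/Shutdown"
# Classic syslog timestamp, e.g. "Jul 28 21:27:17".
STAMP = re.compile(r"([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})")


def boot_intervals(lines, year=2020):
    """Return the gaps, in seconds, between consecutive boot markers."""
    boots = []
    for line in lines:
        if BOOT_MARKER not in line:
            continue
        match = STAMP.search(line)
        if match:
            # syslog has no year field, so one is supplied explicitly.
            stamp = f"{year} {match.group(1)}"
            boots.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(boots, boots[1:])]


# The seven boot markers from the log above, plus one line that is filtered out.
sample = [
    "/var/log/syslog:Jul 28 21:27:17 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...",
    "/var/log/syslog:Jul 28 21:27:17 z170itx systemd[1]: Started Update UTMP about System Boot/Shutdown.",
    "/var/log/syslog:Jul 28 21:47:13 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...",
    "/var/log/syslog:Jul 28 22:07:12 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...",
    "/var/log/syslog:Jul 28 22:24:37 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...",
    "/var/log/syslog:Jul 28 22:40:11 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...",
    "/var/log/syslog:Jul 28 23:04:53 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...",
    "/var/log/syslog:Jul 28 23:42:23 z170itx systemd[1]: Starting Update UTMP about System Boot/Shutdown...",
]

for gap in boot_intervals(sample):
    print(f"{gap // 60}m{gap % 60:02d}s")
```

On the seven boots listed above this prints 19m56s, 19m59s, 17m25s, 15m34s, 24m42s and 37m30s. These are boot-to-boot gaps, so they include the time spent rebooting, but they suggest the interval is less regular than it first looked.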
Old 2020-07-29, 04:04   #6
paulunderwood
 
Sep 2002
Database er0rr

110101010101₂ Posts

This looks like a problem with unattended upgrades. Remove it with sudo apt remove unattended-upgrades or disable it via sudo dpkg-reconfigure unattended-upgrades

Last fiddled with by paulunderwood on 2020-07-29 at 04:10
Old 2020-07-29, 04:20   #7
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2×3,581 Posts

Quote:
Originally Posted by Prime95
Yes, 19 minutes like clockwork.

Ooh, I just got 32 minutes before reboot.

Trying Paul's suggestion.
Then, I'm going to unplug mobos one at a time -- my guess is it's a "NIC gone nuts" problem.
Old 2020-07-29, 13:39   #8
EdH
 
"Ed Hall"
Dec 2009
Adirondack Mtns

17·197 Posts

This is interesting! I started having similar troubles with a couple of i7s running Ubuntu 16.04. One of them has totally died and I haven't gone further with troubleshooting it yet - the other one has lengthened its intervals now that it has a working fan, but still reboots more than once a day. But, my machines are old and the one had a bad case fan, so I attributed the reboots to heat. I wonder. . .

OTOH, I have an i7 laptop that I suspend before bed (as I do many others), that wakes itself up during the night. I often find it already at work when I awaken the others.
Old 2020-07-29, 14:01   #9
rogue
 
"Mark"
Apr 2003
Between here and the

3×1,973 Posts

Is it possible that you have a bad CPU? This happened to my wife's computer a couple of months ago. The computer worked without issues for two years, then all of a sudden I could not boot into Windows. It would just restart before I got to the desktop. Thinking that Windows was corrupted, I was able to log into a command prompt and back up all personal documents. I then reinstalled Windows on another disk. I could get to the desktop on that disk, but within a minute or so it would restart with no blue screen. I took it in to a nearby shop and they eventually determined that the issue was with the CPU. It was under warranty from Intel, thus I could get a new CPU at no charge. Intel could not provide that specific CPU, so they provided a discounted upgraded CPU (7700 to 9700 or something like that).
Old 2020-07-30, 01:20   #10
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·3,581 Posts

What I've discovered thus far: This is a 5-CPU system. One CPU has a disk; the other four netboot off the first CPU. They all connect to the same network switch, which is crammed inside the CPU case.

No problems if main CPU and netboot CPUs #1 and #4 are running.
Adding netboot CPU#2 fails.
Adding netboot CPU#3 fails.
Now, I may have used the same Ethernet cable for #2 and #3: the cable ordinarily used for #3 isn't working, so I chose a different one at random (random being one of 3 other cables, as there used to be 6 netboot CPUs).

Now testing #3 with a different cable.

New working theory: Recent work on the machine jostled or loosened one (or more) of the Ethernet cables, causing noise on one of the lines. The OS is trying to process the noise and falling behind, until 19 minutes later a buffer overflows and the OS dies. /var/log/syslog has no entries right before each reboot.
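
One way to test the noisy-cable theory without waiting for the next reboot is to watch the kernel's per-interface error counters. What follows is a sketch only, not something from the thread. It assumes a Linux host that exposes counters under /sys/class/net/<iface>/statistics; the interface name `eth0` and the helpers `read_nic_errors` and `counter_deltas` are placeholders.

```python
from pathlib import Path

# Counters that tend to climb when a cable or port is marginal.
ERROR_COUNTERS = (
    "rx_errors",
    "rx_crc_errors",
    "rx_frame_errors",
    "rx_dropped",
    "tx_errors",
)


def read_nic_errors(iface, root="/sys/class/net"):
    """Return {counter: value} for one interface; missing counters are skipped."""
    stats_dir = Path(root) / iface / "statistics"
    counters = {}
    for name in ERROR_COUNTERS:
        path = stats_dir / name
        if path.is_file():
            counters[name] = int(path.read_text().strip())
    return counters


def counter_deltas(before, after):
    """Return only the counters that increased between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the NIC that feeds the switch.
    print(read_nic_errors("eth0"))
```

Taking one snapshot, waiting a few minutes, and passing both to `counter_deltas` would show whether CRC or frame errors rise while a suspect cable is plugged in. Flat counters would not rule the cable out, since the errors could be landing on a netboot node's NIC instead.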
Old 2020-07-30, 03:53   #11
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·3,581 Posts

netboot cpu #3 on a different cable: 2.5 hours and going strong
All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.