Diagnosing lock up with Debian


Postby jeffry7 » 2017-12-03 17:04

I have an older HP tower to run Linux things. I generally run it headless with XRDP, but currently have a KVM hooked up because I am trying to debug a periodic lockup.

The machine has been having lock-ups for a long time and I have managed it by turning it off when I am done and back on when I need it. If I leave it running, it locks up within a couple of days.

By lock up I mean I can't connect via Remote Desktop. It does not respond to ping. The keyboard lights do not change when you press Caps Lock. The display is also black, but this is probably because the screen blanks after some time to prevent burn-in, and the lockup happens after that.

This is a stock install of the latest Debian with some libraries added (Eclipse, OpenCV and some related things). I don't have any particular process running when the lockups hit. It is essentially idle, waiting for me to log in.

The syslog and several other logs just show normal activity right up to the lock up. Then there is a long gap till I reboot.
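One way to pin down when the machine actually stopped is to look for the widest gap between consecutive log timestamps. The following is only a sketch: it assumes lines that begin with a stamp like "Dec 04 02:08:01" (the format seen in the log excerpts in this thread), and the function name and the year parameter are made up for illustration, since the stamps carry no year.

```python
from datetime import datetime

def largest_gap(lines, year=2017):
    """Return (gap, before, after) for the widest pause between log entries,
    or None if fewer than two timestamped lines are found."""
    stamps = []
    for line in lines:
        # Assumed format: the first 15 characters look like "Dec 04 02:08:01".
        try:
            stamps.append(
                datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
            )
        except ValueError:
            continue  # skip lines without a leading timestamp
    best = None
    for before, after in zip(stamps, stamps[1:]):
        gap = after - before
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best
```

Passing it the lines of a saved log returns the last entry before the silence and the first entry after the reboot. It does not handle a log that runs across New Year.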

I have tried a memtest. I left it running nearly 24 hours giving me several complete passes. No failures.

I have tried logging the temperatures via a cron task every minute. The CPU temperatures right before what I presume was a lockup were 29 and 35 C. (This is all the temperature data I get. It is an older machine and I guess doesn't have as much data available.)

The weirdest clue is garbage characters in the log files. Both my temperature log and the syslog had repeating ^@ in reverse color, which I believe means there are unprintable characters in the log files.

I am guessing that the next thing to check is the power supply, but I am hesitant to simply shotgun this by replacing the power supply without some idea whether that is the problem. (How many parts do I replace before I get a new machine?)

Any ideas? Are the garbage characters a useful clue?

Well, another clue occurs to me. Before I put the temperature test in cron, the syslog did not have any funny characters. It just ended prior to the lockup. Now there are funny characters, and they come right after the line reporting that a cron job ran. The log my temperature script creates has the same funny characters. So these are probably related.
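For reference, ^@ is how pagers and editors such as less and vi display a NUL byte (0x00). A run of NULs at the point where a log stops is often what is left when the file's length reached the disk but the data being written did not, which would fit a machine that died abruptly in the middle of a write rather than pointing at the logs themselves. That reading is an assumption, not something this thread confirms. A small sketch for locating such runs (the function name and the minimum run length are made up for illustration):

```python
import re

def find_nul_runs(path, min_len=16):
    """Return a list of (offset, length) for each run of NUL bytes
    at least min_len bytes long in the file at path."""
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    with open(path, "rb") as fh:
        data = fh.read()  # log files are small enough to read whole
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]
```

Comparing the offsets with the text just before each run shows which entry was being written when the machine stopped.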

Thanks!
jeffry7
 
Posts: 6
Joined: 2017-12-03 16:41

Re: Diagnosing lock up with Debian

Postby Head_on_a_Stick » 2017-12-03 17:43

Use systemd's journal to analyse your problem: viewtopic.php?p=659961#p659961

See also https://www.digitalocean.com/community/ ... stemd-logs
"Sheer [program] size is often an illusion, reflecting only a need for improvement." — Kernighan and Plauger
Head_on_a_Stick
 
Posts: 6815
Joined: 2014-06-01 17:46
Location: /dev/chair

Re: Diagnosing lock up with Debian

Postby jeffry7 » 2017-12-04 00:15

I have set up journald to persist log data. Now just to wait for the next lockup.
jeffry7
 
Posts: 6
Joined: 2017-12-03 16:41

Re: Diagnosing lock up with Debian

Postby jeffry7 » 2017-12-04 23:00

Got another lock up. That was quick.

The journalctl listing does not show anything obvious. The only thing I saw was persistent attempts to do something with the wireless card. Since a WLAN is not configured this always fails. I have disabled the WLAN so now this won't keep coming up, but since it comes up all the time, not just before a lockup, I am skeptical that is it.

I have attached the log of activity from an hour before until the lock-up.
jeffry7
 
Posts: 6
Joined: 2017-12-03 16:41

Re: Diagnosing lock up with Debian

Postby GarryRicketson » 2017-12-04 23:21

The forum does not accept attachments. You will need to upload the file
to a storage site and post a link to it. Many members use:
http://paste.debian.net/
GarryRicketson
 
Posts: 4477
Joined: 2015-01-20 22:16
Location: Durango, Mexico

Re: Diagnosing lock up with Debian

Postby Bulkley » 2017-12-04 23:33

Lockups and freezes are frequently caused by hardware failure. I'd take the covers off and blow the dust out. Reseat all electrical connections including cards and memory. Check for sticking fans.

Consider that the power supply might be failing. Transistors and microprocessors don't like unstable voltages. A failing power supply can drive you to pulling your hair out.

Do a Web search about your motherboard. Look for failure issues.
Bulkley
 
Posts: 5372
Joined: 2006-02-11 18:35

Re: Diagnosing lock up with Debian

Postby jeffry7 » 2017-12-06 03:37

Hi GarryRicketson

Thanks for the pastebin tip.

https://paste.debian.net/999376/
jeffry7
 
Posts: 6
Joined: 2017-12-03 16:41

Re: Diagnosing lock up with Debian

Postby jeffry7 » 2017-12-06 03:39

Hi Bulkley

I am waiting to see if disabling the WiFi works. (I don't know why it would.)

I will try cleaning it out next. I think you are right that the PS is the next logical step.

Hadn't thought about googling the MoBo. Thanks!
jeffry7
 
Posts: 6
Joined: 2017-12-03 16:41

Re: Diagnosing lock up with Debian

Postby debiman » 2017-12-06 09:59

jeffry7 wrote:Hi GarryRicketson

Thanks for the pastebin tip.

https://paste.debian.net/999376/

What is this?
Code:
Dec 04 02:08:01 blackhole CRON[5461]: pam_unix(cron:session): session closed for user root
Dec 04 02:09:01 blackhole CRON[5469]: pam_unix(cron:session): session opened for user root by (uid=0)
Dec 04 02:09:01 blackhole CRON[5470]: (root) CMD (/root/templog.sh)
Dec 04 02:09:01 blackhole CRON[5469]: pam_unix(cron:session): session closed for user root
It seems to be running ALL the time. Look at the timestamps: opening and closing in less than a second.
Look at /root/templog.sh.
debiman
 
Posts: 1633
Joined: 2013-03-12 07:18

Re: Diagnosing lock up with Debian

Postby acewiza » 2017-12-06 13:55

Something recently went wrong, causing lockups related to gdm3 on my primary workstation. About 2 weeks ago it started randomly (2-3 times a day) halting, kicking me out to the login screen. After a few days of troubleshooting I switched to lightdm and the problem stopped. No evidence I could find was ever logged anywhere I looked.
Nobody would ever ask questions If everyone possessed encyclopedic knowledge of the man pages.
acewiza
 
Posts: 272
Joined: 2013-05-28 12:38
Location: Out West

Re: Diagnosing lock up with Debian

Postby jeffry7 » 2017-12-07 00:47

debiman wrote:
jeffry7 wrote:Hi GarryRicketson

Thanks for the pastebin tip.

https://paste.debian.net/999376/

What is this?
Code:
Dec 04 02:08:01 blackhole CRON[5461]: pam_unix(cron:session): session closed for user root
Dec 04 02:09:01 blackhole CRON[5469]: pam_unix(cron:session): session opened for user root by (uid=0)
Dec 04 02:09:01 blackhole CRON[5470]: (root) CMD (/root/templog.sh)
Dec 04 02:09:01 blackhole CRON[5469]: pam_unix(cron:session): session closed for user root
It seems to be running ALL the time. Look at the timestamps: opening and closing in less than a second.
Look at /root/templog.sh.


Hi debiman,
This is the cron job I put in to log the temperature. The script just appends the output of lm-sensors to a log file, with a timestamp on each entry. I set the script to run once a minute, so each run opens and closes a cron session within the same second.
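The contents of /root/templog.sh are not shown in this thread. As a rough illustration of what such a logger does, and of one way to make the last entry more likely to survive a hard lockup, here is a hypothetical Python equivalent that forces each entry to disk before returning. The function name and log format are assumptions, not the original script.

```python
import os
from datetime import datetime

def append_reading(path, reading):
    """Append one timestamped reading and force it to disk before returning."""
    stamp = datetime.now().isoformat(timespec="seconds")
    with open(path, "a") as fh:
        fh.write(f"{stamp} {reading.strip()}\n")
        fh.flush()             # push Python's buffer to the kernel
        os.fsync(fh.fileno())  # ask the kernel to write the data to disk
```

The reading could come from the lm-sensors command, for example subprocess.run(["sensors"], capture_output=True, text=True).stdout. Syncing on every write costs a little I/O once a minute, but it narrows the window in which a lockup can leave a partly written entry.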
jeffry7
 
Posts: 6
Joined: 2017-12-03 16:41

