Diagnosing lock up with Debian

jeffry7
Posts: 26
Joined: 2017-12-03 16:41

Diagnosing lock up with Debian

#1 Post by jeffry7 »

I have an older HP tower that I use to run Linux things. I generally run it headless with XRDP, but I currently have a KVM hooked up because I am trying to debug a periodic lockup.

The machine has been having lock-ups for a long time and I have managed it by turning it off when I am done and back on when I need it. If I leave it running, it locks up within a couple of days.

By lock up I mean I can't connect via Remote Desktop. It does not respond to Ping. The keyboard lights do not change when you press caps lock. The display is also black, but this is probably because the screen goes black after some time to prevent burn in and then the lockup happens.

This is a stock install of the latest Debian with some software added (Eclipse, OpenCV and some related things). I don't have any particular process running when the lockups hit. It is essentially idle, waiting for me to log in.

The syslog and several other logs just show normal activity right up to the lock up. Then there is a long gap till I reboot.

I have tried a memtest. I left it running nearly 24 hours giving me several complete passes. No failures.

I have tried logging the temperatures via a cron task every minute. The CPU temperatures right before what I presume was a lockup were 29 and 35 C. (This is all the temperature data I get. It is an older machine and I guess it doesn't have as much data available.)
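For anyone wanting to reproduce this kind of logging, here is a minimal sketch in Python. It is not the script used on this machine (that is a shell script whose contents are not shown in this thread). The function name `log_temperature` and any log path passed to it are hypothetical, and it assumes the `sensors` command from lm-sensors is installed. The `os.fsync` call is an addition, on the assumption that entries written just before a freeze might otherwise still be sitting in the page cache and never reach the disk.

```python
import datetime
import os
import subprocess


def log_temperature(log_path, command=("sensors",)):
    """Append one timestamped sensor reading to log_path and force it to disk.

    `command` defaults to the lm-sensors `sensors` tool (assumed installed);
    `log_path` is whatever file you choose (hypothetical, not from the thread).
    """
    # Capture the command's text output; a non-zero exit status is tolerated
    # so that a failed reading still leaves a timestamped entry in the log.
    result = subprocess.run(
        list(command), capture_output=True, text=True, check=False
    )
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    entry = f"--- {stamp} ---\n{result.stdout}"
    if not entry.endswith("\n"):
        entry += "\n"
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(entry)
        # Flush Python's buffer, then ask the kernel to write the data out now.
        log.flush()
        os.fsync(log.fileno())
    return entry
```

Run once a minute from cron, each call adds one entry. If the machine freezes, the last entry on disk should then be close to the moment of the lockup.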

The weirdest clue is garbage characters in the log files. Both my temperature log and the syslog had repeating ^@ in reverse color, which I believe means there are unprintable characters in the log files.

I am guessing that the next thing to check is the power supply, but I am hesitant to simply shotgun this by replacing the power supply without some idea of whether that is the problem. (How many parts do I replace before I get a new machine?)

Any ideas? Are the garbage characters a useful clue?

Another clue occurs to me. Before I put the temperature logging in cron, the syslog did not have any funny characters. It just ended prior to the lockup. Now there are funny characters, and they come right after the line recording that a cron job ran. The log my temperature script creates has the same funny characters, so these are probably related.
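For reference, ^@ is how viewers such as less and vi display a NUL byte (0x00). One possible explanation, which is not confirmed for this machine, is that the file was extended but the data had not reached the disk when the system froze. That would mark roughly where the last write was in flight rather than point to a cause. The sketch below reports where the NUL runs sit in a log file. The function name `find_nul_runs` is made up for illustration.

```python
def find_nul_runs(path, chunk_size=65536):
    """Return a list of (offset, length) pairs, one per run of NUL bytes."""
    runs = []
    run_start = None  # file offset where the current NUL run began, if any
    offset = 0        # file offset of the first byte of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for i, byte in enumerate(chunk):
                if byte == 0:
                    if run_start is None:
                        run_start = offset + i
                elif run_start is not None:
                    # A non-NUL byte ends the current run.
                    runs.append((run_start, offset + i - run_start))
                    run_start = None
            offset += len(chunk)
    if run_start is not None:
        # The file ended while still inside a run of NUL bytes.
        runs.append((run_start, offset - run_start))
    return runs
```

Running it over both the syslog and the temperature log and comparing the offsets with the surrounding timestamps shows whether the NUL runs line up with the last entries written before each lockup.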

Thanks!

Head_on_a_Stick
Posts: 14114
Joined: 2014-06-01 17:46
Location: London, England
Has thanked: 81 times
Been thanked: 132 times

Re: Diagnosing lock up with Debian

#2 Post by Head_on_a_Stick »

deadbang

jeffry7
Posts: 26
Joined: 2017-12-03 16:41

Re: Diagnosing lock up with Debian

#3 Post by jeffry7 »

I have set up journald to persist log data. Now just to wait for the next lockup.

jeffry7
Posts: 26
Joined: 2017-12-03 16:41

Re: Diagnosing lock up with Debian

#4 Post by jeffry7 »

Got another lock up. That was quick.

The journalctl listing does not show anything obvious. The only thing I saw was persistent attempts to do something with the wireless card. Since a WLAN is not configured this always fails. I have disabled the WLAN so now this won't keep coming up, but since it comes up all the time, not just before a lockup, I am skeptical that is it.

I have attached the log of activity from an hour before until the lock-up.

GarryRicketson
Posts: 5644
Joined: 2015-01-20 22:16
Location: Durango, Mexico

Re: Diagnosing lock up with Debian

#5 Post by GarryRicketson »

The forum does not accept attachments. You will need to upload the file to a storage site and post a link to it. Many members use:
http://paste.debian.net/

Bulkley
Posts: 6383
Joined: 2006-02-11 18:35
Has thanked: 2 times
Been thanked: 39 times

Re: Diagnosing lock up with Debian

#6 Post by Bulkley »

Lockups and freezes are frequently caused by hardware failure. I'd take the covers off and blow the dust out. Reseat all electrical connections including cards and memory. Check for sticking fans.

Consider that the power supply might be failing. Transistors and microprocessors don't like unstable voltages. A failing power supply can drive you to pulling your hair out.

Do a web search on your motherboard model. Look for known failure issues.

jeffry7
Posts: 26
Joined: 2017-12-03 16:41

Re: Diagnosing lock up with Debian

#7 Post by jeffry7 »

Hi GarryRicketson

Thanks for the pastebin tip.

https://paste.debian.net/999376/

jeffry7
Posts: 26
Joined: 2017-12-03 16:41

Re: Diagnosing lock up with Debian

#8 Post by jeffry7 »

Hi Bulkley

I am waiting to see if disabling the WiFi works. (I don't know why it would.)

I will try cleaning it out next. I think you are right that the PS is the next logical step.

Hadn't thought about googling the MoBo. Thanks!

debiman
Posts: 3063
Joined: 2013-03-12 07:18

Re: Diagnosing lock up with Debian

#9 Post by debiman »

jeffry7 wrote:
Hi GarryRicketson

Thanks for the pastebin tip.

https://paste.debian.net/999376/

what is this:

Dec 04 02:08:01 blackhole CRON[5461]: pam_unix(cron:session): session closed for user root
Dec 04 02:09:01 blackhole CRON[5469]: pam_unix(cron:session): session opened for user root by (uid=0)
Dec 04 02:09:01 blackhole CRON[5470]: (root) CMD (/root/templog.sh)
Dec 04 02:09:01 blackhole CRON[5469]: pam_unix(cron:session): session closed for user root
???
it seems to be running ALL the time, look at the timestamps, opening & closing in less than a second.
look at /root/templog.sh.

acewiza
Posts: 357
Joined: 2013-05-28 12:38
Location: Out West

Re: Diagnosing lock up with Debian

#10 Post by acewiza »

Something recently went wrong causing lockups related to gdm3 on my primary workstation. About 2 weeks ago it started randomly (2-3 times a day) halting and kicking me out to the login screen. After a few days of troubleshooting I switched to lightdm and the problem stopped. I never found any evidence logged anywhere I looked.
Nobody would ever ask questions If everyone possessed encyclopedic knowledge of the man pages.

jeffry7
Posts: 26
Joined: 2017-12-03 16:41

Re: Diagnosing lock up with Debian

#11 Post by jeffry7 »

debiman wrote:
it seems to be running ALL the time, look at the timestamps, opening & closing in less than a second.
look at /root/templog.sh.

Hi debiman,
This is the cron job I put in to log the temperature. The script is just appending the output of lmsensors to a log file. It also time stamps the entry. I set the script to run once a minute.
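The pattern can be checked from the log itself: a once-a-minute job should show one short cron session per minute, not a session that stays open. The sketch below pairs the "session opened" and "session closed" lines by PID and reports how long each session lasted. The function name `session_durations` is made up for illustration. The year is an assumption passed in by the caller, since the journal lines carry none, and month names are parsed as English abbreviations.

```python
import datetime
import re

# Matches lines such as:
# Dec 04 02:09:01 blackhole CRON[5469]: pam_unix(cron:session): session opened ...
SESSION = re.compile(
    r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ CRON\[(\d+)\]: "
    r"pam_unix\(cron:session\): session (opened|closed)"
)

# The four lines quoted earlier in this thread.
EXCERPT = """\
Dec 04 02:08:01 blackhole CRON[5461]: pam_unix(cron:session): session closed for user root
Dec 04 02:09:01 blackhole CRON[5469]: pam_unix(cron:session): session opened for user root by (uid=0)
Dec 04 02:09:01 blackhole CRON[5470]: (root) CMD (/root/templog.sh)
Dec 04 02:09:01 blackhole CRON[5469]: pam_unix(cron:session): session closed for user root
""".splitlines()


def session_durations(lines, year=2017):
    """Return (start_time, seconds) for each cron session opened and closed.

    `year` is an assumption supplied by the caller; the log lines omit it.
    Sessions whose "opened" line is missing from `lines` are skipped.
    """
    opened = {}   # PID -> time the session was opened
    sessions = []
    for line in lines:
        match = SESSION.match(line.strip())
        if not match:
            continue
        stamp, pid, event = match.groups()
        when = datetime.datetime.strptime(
            f"{year} {stamp}", "%Y %b %d %H:%M:%S"
        )
        if event == "opened":
            opened[pid] = when
        elif pid in opened:
            start = opened.pop(pid)
            sessions.append((start, (when - start).total_seconds()))
    return sessions
```

Calling `session_durations(EXCERPT)` on the quoted lines yields a single session that starts at 02:09:01 and lasts 0 seconds at the log's one-second resolution. That is what a quick script started once a minute would look like. Over a longer stretch of the log, the start times should be about 60 seconds apart.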
