I have an older HP tower that I use for running Linux. I generally run it headless with XRDP, but I currently have a KVM hooked up because I am trying to debug a periodic lockup.
The machine has been locking up for a long time, and I have managed it by turning it off when I am done and back on when I need it. If I leave it running, it locks up within a couple of days.
By lockup I mean I can't connect via Remote Desktop. It does not respond to ping. The keyboard lights do not change when I press Caps Lock. The display is also black, but that is probably because the screen blanks after some time to prevent burn-in, and the lockup happens after that.
This is a stock install of the latest Debian with some software added (Eclipse, OpenCV and some related things). I don't have any particular process running when the lockups hit. The machine is essentially idle, waiting for me to log in.
The syslog and several other logs just show normal activity right up to the lock up. Then there is a long gap till I reboot.
I have tried a memtest. I left it running nearly 24 hours giving me several complete passes. No failures.
I have tried logging the temperatures via a cron task every minute. The CPU temperatures right before what I presume was a lockup were 29 and 35 C. (This is all the temperature data I get. It is an older machine and I guess it doesn't expose as much sensor data.)
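For reference, a minimal sketch of a per-minute temperature logger of this kind. It is not the actual script used here. It assumes the kernel exposes readings under /sys/class/thermal/thermal_zone*/temp (an older machine may only expose them through lm-sensors instead), and the names read_temps and log_temps and the log path are made up for illustration. The one deliberate choice is the os.fsync call, which asks the kernel to push each line to disk before the script exits, so the last timestamp in the log is more likely to survive a hard hang.

```python
#!/usr/bin/env python3
"""Append one timestamped temperature line per run and force it to disk."""
import glob
import os
import sys
import time


def read_temps(sysfs_root="/sys/class/thermal"):
    """Return temperatures in degrees C, one per readable thermal zone."""
    temps = []
    pattern = os.path.join(sysfs_root, "thermal_zone*", "temp")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                # sysfs reports millidegrees Celsius as an integer
                temps.append(int(f.read().strip()) / 1000.0)
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
    return temps


def log_temps(log_path, sysfs_root="/sys/class/thermal"):
    """Append one line to log_path, fsync it, and return the line written."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    readings = " ".join(f"{t:.1f}C" for t in read_temps(sysfs_root))
    line = f"{stamp} {readings or 'no-sensors'}\n"
    with open(log_path, "a") as f:
        f.write(line)
        f.flush()             # move the line from Python's buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to write it to the disk now
    return line


# Only runs when invoked as: script.py /path/to/temperature.log
if __name__ == "__main__" and len(sys.argv) == 2:
    log_temps(sys.argv[1])
```

Run from cron with a line along the lines of `* * * * * /usr/bin/python3 /home/me/templog.py /home/me/temperature.log` (both paths hypothetical). If the log still ends in garbage even with the fsync in place, that by itself would be a useful data point.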
The weirdest clue is garbage characters in the log files. Both my temperature log and the syslog had repeating ^@ in reverse color, which I believe means there are unprintable characters in the log files.
I am guessing that the next thing to check is the power supply, but I am hesitant to simply shotgun this by replacing the power supply without some idea of whether that is the problem. (How many parts do I replace before I get a new machine?)
Any ideas? Are the garbage characters a useful clue?
Well, another clue occurs to me. Before I put the temperature test in cron, the syslog did not have any funny characters. It just ended prior to the lockup. Now there are funny characters, and they come right after the line recording that a cron job ran. The log my temperature script creates has the same funny characters. So these are probably related.
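To pin down exactly where the funny characters sit, here is a small Python sketch that finds every run of NUL bytes in a log and prints the last readable line before it. ^@ is how viewers such as less and vi display a NUL byte (0x00). The function name find_nul_runs is made up for illustration, and the paths in the usage note are assumptions. One possible explanation, not confirmed for this machine, is that the file was extended just before the hang but the data never reached the disk, which would leave zero-filled space at the point of the crash.

```python
import re


def find_nul_runs(path, min_len=1):
    """Return (offset, length, last_line_before) for each run of NUL bytes."""
    with open(path, "rb") as f:
        data = f.read()  # log files are small enough to read in one go

    runs = []
    for match in re.finditer(rb"\x00+", data):
        length = match.end() - match.start()
        if length < min_len:
            continue
        # Text preceding the run, without its trailing newline
        before = data[:match.start()].rstrip(b"\n")
        last_line = before.rsplit(b"\n", 1)[-1].replace(b"\x00", b"")
        runs.append((match.start(), length, last_line.decode("utf-8", "replace")))
    return runs


def report(path):
    """Print one summary line per NUL run found in path."""
    for offset, length, last_line in find_nul_runs(path):
        print(f"{path}: {length} NUL bytes at offset {offset}, after: {last_line!r}")
```

Calling `report("/var/log/syslog")` (reading it usually needs root or membership in the adm group on Debian) and then `report` on the temperature log shows whether the NUL runs in both files follow a line from the same minute, which would tie them to the same event.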
Thanks!