Help with diagnosing an OS freeze
Hi all,
I have recently been experiencing an issue with Debian 12. The system occasionally freezes completely (nothing is responsive, audio is stuck in a loop of the last 2 seconds, and SysRq does not work), so I am forced to do a hard reset. This happens quite infrequently (maybe every 2-3 weeks) at seemingly random times.
Long story short...the most difficult scenario one could possibly imagine to diagnose.
Now, I do not expect anyone to tell me what is wrong with my system, considering it could be anything from software to hardware to configuration to BIOS (I use coreboot) to power issues and so on.
What I hope is that somebody could help me set up Debian so that, when this happens again, I can gather as many logs as possible and at least get some clues about where to start troubleshooting. Of course, I already went through the usual /var/log options and journalctl but could not find anything useful. There are no logs for the time just before the crash event.
Are there any tools that can collect more logs or memory dumps before a crash?
I have read a bit about kdump and kexec, but I am having a hard time understanding how to properly set them up.
Does the capture of kernel memory even work if I have to hard reset the system and cannot reboot?
My system specs are: AMD A8-6500 APU, A88XM-E motherboard, coreboot and 8GB of DDR3 RAM.
I checked the RAM for errors and PSU for voltage outputs, everything seems fine.
Re: Help with diagnosing an OS freeze
OK, I installed kdump-tools and used this command
Code: Select all
sudo grep USE_KDUMP /etc/default/kdump-tools
to check, and I have USE_KDUMP=1.
With
Code: Select all
sudo grep LOAD_KEXEC /etc/default/kexec
I get LOAD_KEXEC=true.
I tried to reboot as recommended to enable the crashkernel parameter, however the system hung on reboot (is this normal?).
After doing a hard reset, I restarted the system and checked with
Code: Select all
sudo kdump-config show
I get "current state: ready to dump".
So I decided to test with
Code: Select all
echo c > /proc/sysrq-trigger
This command triggers exactly the freeze I am trying to diagnose (which I now assume is caused by a kernel panic). Nothing responds, so I have to hard reset again.
After restarting the system I checked /var/crash and there is no kernel dump there.
Am I doing something wrong or did I miss something?
Re: Help with diagnosing an OS freeze
Yes, that was it! I managed to make it work by enabling kdump-tools.service in Debian (on Red Hat it is kdump.service).
After that I tested again with
Code: Select all
echo c > /proc/sysrq-trigger
The system rebooted to the dump kernel and I found the dump files in /var/crash.
Hope I can learn something more when the crash happens again.
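For anyone following along, the two things that had to be true before the test dump worked can be checked with a small script. This is only a sketch: the function name kdump_ready is made up for illustration, and it assumes the kernel exposes /proc/cmdline and /sys/kernel/kexec_crash_loaded, as current Linux kernels do. The paths are parameters so the function can also be pointed at copies of those files.

```shell
#!/bin/sh
# Sketch: check two kdump prerequisites discussed in this thread.
# kdump_ready is a hypothetical helper name, not part of kdump-tools.

kdump_ready() {
    cmdline_file="${1:-/proc/cmdline}"
    loaded_file="${2:-/sys/kernel/kexec_crash_loaded}"
    status=0

    # 1. The running kernel must have been booted with crashkernel=...
    if grep -q 'crashkernel=' "$cmdline_file" 2>/dev/null; then
        echo "ok: crashkernel parameter present"
    else
        echo "missing: crashkernel parameter (reboot after installing kdump-tools)"
        status=1
    fi

    # 2. A crash kernel must be loaded; the kernel reports this as 1.
    if [ "$(cat "$loaded_file" 2>/dev/null)" = "1" ]; then
        echo "ok: crash kernel loaded"
    else
        echo "missing: crash kernel not loaded (is kdump-tools.service enabled?)"
        status=1
    fi

    return "$status"
}
```

After sourcing the file, running kdump_ready with no arguments checks the live system. If the second check fails, "systemctl is-enabled kdump-tools.service" shows whether the service is enabled, which was the missing piece here.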
Re: Help with diagnosing an OS freeze
A couple of days ago the freeze event happened again.
However, the system did not reboot. I suppose it gets completely frozen and the only thing I could do was a hard reset with the power button.
Unfortunately, no useful logs to help going forward.
I think it might not be a software issue, but probably hardware related.
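Since kdump only runs when the kernel actually panics, one thing that might be worth trying is telling the kernel to panic when its lockup detectors fire. The sketch below writes a sysctl fragment for that. The function name and the file name 90-lockup-panic.conf are made up for illustration, and it is an assumption that the running kernel has the lockup detectors built in. If the freeze stops the CPU so completely that not even the NMI watchdog runs, this will not capture anything either.

```shell
#!/bin/sh
# Sketch: write a sysctl fragment that turns detected lockups into panics.
# write_lockup_panic_conf and the default file name are hypothetical.

write_lockup_panic_conf() {
    target="${1:-/etc/sysctl.d/90-lockup-panic.conf}"
    cat > "$target" <<'EOF'
# Turn detected lockups into panics so that kdump gets a chance to run.
kernel.nmi_watchdog = 1
kernel.hardlockup_panic = 1
kernel.softlockup_panic = 1
kernel.panic_on_oops = 1
EOF
}
```

Writing to /etc/sysctl.d needs root, and "sysctl --system" applies the settings without a reboot. Removing the file and rebooting undoes the change.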
Re: Help with diagnosing an OS freeze
You could try a live CD, such as Debian or Knoppix, to make sure. It might also be good to track the temperatures (I don't know if that was already mentioned; I only read the thread, not the links).
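For tracking the temperatures, a minimal logger could look like the sketch below. It assumes the kernel exposes thermal zones under /sys/class/thermal with values in millidegrees Celsius; on this board the sensors may only show up through lm-sensors instead, in which case the loop simply logs nothing. The function name log_temps and the default log path are made up for illustration.

```shell
#!/bin/sh
# Sketch: append one line per thermal zone to a log file.
# log_temps and the default log file name are hypothetical.

log_temps() {
    sysfs_root="${1:-/sys/class/thermal}"
    logfile="${2:-$HOME/temps.log}"

    for f in "$sysfs_root"/thermal_zone*/temp; do
        # Skip zones that are missing or cannot be read.
        [ -r "$f" ] || continue
        milli=$(cat "$f" 2>/dev/null) || continue
        [ -n "$milli" ] || continue

        zone=$(basename "$(dirname "$f")")
        # sysfs reports millidegrees Celsius; log whole degrees.
        printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$zone" "$((milli / 1000))" >> "$logfile"
    done

    # Flush to disk so the last entries survive a hard reset.
    sync
}
```

Calling it in a loop, for example "while true; do log_temps; sleep 60; done", leaves a record of the last readings before a freeze.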