kernel watchdog: BUG: soft lockup - CPU#8 stuck for 22s - some process does not launch, one unable to kill

postcd
Posts: 133
Joined: 2022-01-08 18:33
Has thanked: 48 times
Been thanked: 2 times

#1 Post by postcd »

Hello. After more than 24 hours of uptime on Debian 11 (stable kernel Linux 5.10.0-15-amd64 #1 SMP Debian 5.10.120-1), I started getting errors like the following on the terminal:

Code:

 kernel:[245409.251352] watchdog: BUG: soft lockup - CPU#8 stuck for 23s! [pqi fc5f0aa5910:3809237]
(It is always CPU thread 8, which did not seem over-utilized in "htop", and neither did the CPU as a whole.)
Related dmesg command output is here
Memory: about 9 GB was free, but free swap was 0 bytes out of roughly 900 MB total on the /dev/dm-2 partition. (I wish I had created a larger swap; I do not know exactly how to increase it now, since this drive uses LUKS full-disk encryption.)
Full system details (inxi).
HERE is how the system was utilized while the errors appeared on the terminal (dstat command).
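
To check whether the reports really always name the same CPU and task, the watchdog lines can be tallied from a saved log. The following is only a minimal sketch: it assumes the line format shown in the message quoted above, and the function names (parse_soft_lockups, lockups_per_cpu) are made up for illustration.

```python
import re
from collections import Counter

# Matches the watchdog line format quoted in the post above (an assumption:
# other kernel versions may word the message differently).
SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(lines):
    """Yield one dict per soft-lockup line found in an iterable of log lines."""
    for line in lines:
        m = SOFT_LOCKUP.search(line)
        if m:
            yield {
                "ts": float(m["ts"]),      # seconds since boot
                "cpu": int(m["cpu"]),      # CPU number that was stuck
                "secs": int(m["secs"]),    # reported stuck duration
                "comm": m["comm"],         # task name (may contain spaces)
                "pid": int(m["pid"]),
            }


def lockups_per_cpu(lines):
    """Count soft-lockup reports per CPU number."""
    return Counter(event["cpu"] for event in parse_soft_lockups(lines))


sample = [
    " kernel:[245409.251352] watchdog: BUG: soft lockup - CPU#8 stuck for 23s! "
    "[pqi fc5f0aa5910:3809237]",
]
events = list(parse_soft_lockups(sample))
print(events)
print(lockups_per_cpu(sample))
```

Feeding it the lines of a saved dmesg or journal dump instead of the one-line sample gives a per-CPU count, which helps judge whether "always CPU 8" holds over the whole log.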

At this point I was unable to start some apps (no error shown; unfortunately I only started them via the GUI), and I was unable to "sudo kill -9" one third-party app (PID) that had the "<defunct>" suffix in the process list.
The reboot command did nothing, so I reset the machine.
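
A note on the "<defunct>" process: that suffix marks a zombie, i.e. a process that has already exited and only remains in the process table until its parent collects its exit status, so "kill -9" has nothing left to act on. Processes in uninterruptible sleep (state D) also do not react to signals until the kernel operation they wait on finishes. A rough sketch for listing both kinds together with their parent PID, assuming a Linux /proc filesystem (the helper names parse_stat and stuck_processes are hypothetical):

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state, ppid) from the contents of /proc/<pid>/stat.

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so split on the LAST ')' rather than on whitespace.
    """
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    pid = int(stat_text[:open_idx])
    comm = stat_text[open_idx + 1:close_idx]
    rest = stat_text[close_idx + 1:].split()
    return pid, comm, rest[0], int(rest[1])


def stuck_processes(proc_root="/proc"):
    """List processes in state Z (zombie) or D (uninterruptible sleep)."""
    found = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                pid, comm, state, ppid = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process vanished or entry unreadable while scanning
        if state in ("Z", "D"):
            found.append((pid, comm, state, ppid))
    return found


if __name__ == "__main__" and os.path.isdir("/proc"):
    for pid, comm, state, ppid in stuck_processes():
        print(f"pid={pid} comm={comm} state={state} parent={ppid}")
```

For a zombie, the parent PID is the interesting part: the entry disappears once that parent reaps it or itself exits.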

Research:
1)
A 'soft lockup' is defined as a bug that causes the kernel to loop in kernel mode for more than 20 seconds without giving other tasks a chance to run.
The watchdog daemon will send an non-maskable interrupt (NMI) to all CPUs in the system who, in turn, print the stack traces of their currently running tasks.
https://www.suse.com/support/kb/doc/?id=000018705
2)
Possible swap/memory problem. ... updating the BIOS ... It's not appropriate to have too small of a swap ... free memtest
https://askubuntu.com/a/1264905/456366
It seems I have the latest BIOS, even though it often shows an internal error during the check inside the BIOS. Memtest was done about 1 year ago, and I would prefer not to repeat it unless someone believes it really is a memory issue.
3)
I added acpi=off after the quiet in the startup kernel parameter. That seems to do the trick!!
viewtopic.php?p=741322#p741322
4)
kernel from the debug repository
viewtopic.php?p=699709#p699709
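
On the "more than 20 seconds" figure quoted in item 1: according to the kernel's sysctl documentation, the soft-lockup threshold is twice the kernel.watchdog_thresh setting, whose default is 10 seconds, which is consistent with the "stuck for 22s/23s" messages. A small sketch that reads the setting and prints the resulting threshold; the function names are invented for this example, and the fallback of 10 is the documented default, not a value taken from this system:

```python
from pathlib import Path


def soft_lockup_threshold(watchdog_thresh):
    """Soft-lockup threshold in seconds: twice kernel.watchdog_thresh."""
    return 2 * watchdog_thresh


def read_watchdog_thresh(path="/proc/sys/kernel/watchdog_thresh", default=10):
    """Read kernel.watchdog_thresh; fall back to the documented default."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return default


thresh = read_watchdog_thresh()
print(
    f"watchdog_thresh={thresh}s, "
    f"soft lockup reported after {soft_lockup_threshold(thresh)}s"
)
```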

Questions:
Do you have any idea about the cause, or which (preferably exact) commands to run the next time it happens, so that the source of the problem can be found?
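
One piece of state that seems worth recording the next time the messages appear, given the report of zero free swap, is the swap situation from /proc/meminfo. This is only a sketch of that single measurement, not a full diagnostic procedure; parse_meminfo and swap_used_fraction are illustrative names.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def swap_used_fraction(info):
    """Fraction of swap in use, or None if no swap is configured."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return None
    return (total - info.get("SwapFree", 0)) / total


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
    except OSError:
        info = {}  # not on Linux, or /proc not mounted
    print("MemAvailable kB:", info.get("MemAvailable"))
    print("Swap used fraction:", swap_used_fraction(info))
```

Logging this periodically before the lockup would show whether swap was already exhausted well before the first watchdog message or only around the time it appeared.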

Thank you :)

LE_746F6D617A7A69
Posts: 932
Joined: 2020-05-03 14:16
Has thanked: 7 times
Been thanked: 65 times

Re: kernel watchdog: BUG: soft lockup - CPU#8 stuck for 22s - some process does not launch, one unable to kill

#2 Post by LE_746F6D617A7A69 »

postcd wrote: 2022-06-27 19:23 (It is always CPU thread 8, which did not seem over-utilized in "htop", and neither did the CPU as a whole.)
The soft lockup is inside the kernel, and the message is sent to all kernel threads/per CPU - it can be a coincidence.
postcd wrote: 2022-06-27 19:23 The reboot command did nothing, so I reset the machine.
Quite impossible - what arguments did you use for the reboot command? (see man 8 reboot)
postcd wrote: 2022-06-27 19:23 It seems I have the latest BIOS, even though it often shows an internal error during the check inside the BIOS. Memtest was done about 1 year ago, and I would prefer not to repeat it unless someone believes it really is a memory issue.
What is the exact error message?
1 year is a very long time - I suggest you re-check the RAM.
Bill Gates: "(...) In my case, I went to the garbage cans at the Computer Science Center and I fished out listings of their operating system."
The_full_story and Nothing_have_changed
