watchdog: BUG: soft lockup CPU x stuck on Debian Buster

Post by FoobyDeb » 2019-11-03 09:31

Greetings,

I recently upgraded my home NAS server to Ryzen and switched to Debian 10 at the same time. Ever since, I get soft lockups roughly every 48 to 72 hours.

I'm lost as to what this issue may be, since multiple things were changed (hardware, software, switch to HBA cards etc).

Are there any logs I can check to help identify what the issue is, or what I can do about it? I've read other posts from people having similar issues with the 4.19 kernel, but most of those seem to date from when Buster was still in development rather than a stable release. I'd hope it wasn't released with some weird kernel bug, but hey, it happens. I also read that I should install firmware-linux-nonfree, which I did, and I rebooted to make sure it was in use. Still having issues.
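On the question of which logs to check, here is a minimal sketch. It assumes the stock rsyslog setup that writes kernel messages to /var/log/kern.log, which is an assumption about this install and not something confirmed in the thread. The helper name soft_lockups is hypothetical. It filters kernel log text down to the watchdog lines so that timestamps and CPU numbers can be compared across incidents.

```shell
#!/bin/sh
# Hypothetical helper: keep only the soft-lockup lines from kernel log
# text supplied on stdin.
soft_lockups() {
    grep -F 'watchdog: BUG: soft lockup'
}

# Example invocation (assumes /var/log/kern.log exists and is readable):
#   soft_lockups < /var/log/kern.log
```

If the systemd journal is stored persistently, which may not be the case here, `journalctl -k -b -1 | soft_lockups` would apply the same filter to the previous boot's kernel messages.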

Happy to provide any logs or info if people are able to advise. I am tempted to go back to Debian 9 just to see if it's stable, but it's a bit of a farce getting everything installed and set up each time, so I'd prefer to avoid that if possible. I've done enough fettling as it is!

System specs and config, picture examples

Linux arkp-nas 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u1 (2019-09-20) x86_64 GNU/Linux
Ryzen 1700 (not overclocked, stock cooler)
ASRock X370 Taichi
16GB ECC
9800GTX (the only card I had lying around to use for display output, since there are no onboard graphics)
No GUI Installed
Fresh install from netinstall, in case anyone wonders whether this was originally an upgrade from 9 to 10. Install date check (command stolen from Google):

root@arkp-nas:~# fs=$(df / | tail -1 | cut -f1 -d' ') && tune2fs -l $fs | grep created
Filesystem created: Wed Oct 16 15:38:35 201

lspci output:
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) I/O Memory Management Unit
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
00:03.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 59)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7
03:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] X370 Series Chipset USB 3.1 xHCI Controller (rev 02)
03:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] X370 Series Chipset SATA Controller (rev 02)
03:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] X370 Series Chipset PCIe Upstream Port (rev 02)
16:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
16:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
16:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
16:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port (rev 02)
19:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 02)
1a:00.0 PCI bridge: ASMedia Technology Inc. ASM1184e PCIe Switch Port
1b:03.0 PCI bridge: ASMedia Technology Inc. ASM1184e PCIe Switch Port
1b:05.0 PCI bridge: ASMedia Technology Inc. ASM1184e PCIe Switch Port
1b:07.0 PCI bridge: ASMedia Technology Inc. ASM1184e PCIe Switch Port
1e:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
21:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
22:00.0 VGA compatible controller: NVIDIA Corporation G92 [GeForce 9800 GTX / 9800 GTX+] (rev a2)
23:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
24:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function
24:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor
24:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller
25:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function
25:00.2 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)


Issue starts as this:

https://arkp.co.uk/IMG_20191101_212813.jpg

Gets worse:

https://arkp.co.uk/IMG_20191101_213030.jpg

Ends up like this:

https://arkp.co.uk/IMG_20191101_214558.jpg
Last edited by FoobyDeb on 2019-11-03 15:53, edited 3 times in total.
FoobyDeb
 
Posts: 3
Joined: 2019-11-02 08:07

Re: watchdog: BUG: soft lockup CPU x stuck on Debian Buster

Post by arochester » 2019-11-03 11:28

Perhaps the OP should make use of the Preview button.

There seem to be things that do not show here.
arochester
 
Posts: 1558
Joined: 2010-12-07 19:55

Re: watchdog: BUG: soft lockup CPU x stuck on Debian Buster

Post by FoobyDeb » 2019-11-03 12:23

Oh really?

I did use preview and everything is showing up for me. Odd. I'll redo it.

Re: watchdog: BUG: soft lockup CPU x stuck on Debian Buster

Post by Head_on_a_Stick » 2019-11-03 15:40

It would be fantastic if you could replace those massive images with some thumbnail links to hosting sites — we have forum users with limited bandwidth.

In respect of your problem, do you experience this with a "live" ISO image? That would eliminate misconfiguration as a source.

That soft lockup message can be caused by a faulty or insufficient power supply. Is your machine well-provisioned in that regard?
Head_on_a_Stick
 
Posts: 10599
Joined: 2014-06-01 17:46
Location: /dev/chair

Re: watchdog: BUG: soft lockup CPU x stuck on Debian Buster

Post by FoobyDeb » 2019-11-03 15:59

Done, sorry. I did resize them to ~500kb each, but I'm happy to make them clicky.

That is actually a good shout. I could try with a live CD and manually configure the applications I'm using to replicate my usage. Thanks.

PSU, you say? Hm, it's a Corsair RM550. It's running 10x 4TB HDDs, as well as the originally listed spec and 2x 2118i's. It does have a 9800GTX in there, but I'm hoping that wouldn't be pulling too much power, given I'm only using it for display output. I can yank that out and buy something inexpensive to replace it.
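As a rough sanity check on the PSU question, the sketch below subtracts an estimated peak load from the PSU rating. Only the 550 W rating and the count of 10 drives come from the post above. The per-drive spin-up wattage and the figure for everything else are placeholder guesses, and psu_headroom is a hypothetical helper name. Real numbers should come from the drive and component datasheets.

```shell
#!/bin/sh
# Hypothetical helper: print remaining PSU headroom in watts.
# usage: psu_headroom PSU_WATTS DRIVES WATTS_PER_DRIVE OTHER_WATTS
# All wattages are caller-supplied estimates, not measured values.
psu_headroom() {
    psu=$1
    drives=$2
    per_drive=$3
    other=$4
    load=$(( drives * per_drive + other ))
    echo $(( psu - load ))
}

# Example with ASSUMED figures: 25 W per drive at spin-up and 250 W for
# CPU, GPU, HBAs and board combined.
#   psu_headroom 550 10 25 250    # prints 50
```

A small or negative result would suggest trying staggered spin-up or removing the GPU before blaming the kernel. A comfortable margin would point away from the PSU.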

