PCI Resource Allocation Problem

Postby metal83 » 2019-08-24 23:59

Hello all, I am having some difficulty getting some of my hardware to be detected and work properly with my recent installation of Debian 10.

Here's a quick overview of my system:
AMD Ryzen 5 1600
MSI B450 Tomahawk w/ latest BIOS
16GB DDR4 3200 G.Skill TridentZ
Silicon Power 256GB - NVMe M.2
Nvidia Quadro NVS 295
LSI 9211-8i HBA with Intel RES2SV240 expander
3x 10TB Western Digital
Seasonic Prime 650 Titanium SSR-650TR

I started installation via a thumb drive with the network install image. During the installation process it was not able to detect my onboard NIC. Doing some research I determined that it was a resource conflict. I removed the LSI HBA card and everything seemed to work properly. Putting the card back in, it would not work at all and I would get the following error:
Code:
mpt2sas_cm0: unable to map adapter memory!


I did some research and found others with similar problems who had worked around them by booting with the pci=realloc=off kernel parameter. I used that and it initially seemed to fix the problem. The HBA card worked and my onboard NIC was detected during installation. I noted that when the HBA card was not installed, the OS referred to my onboard NIC as enp24s0, and with the HBA card installed it was enp34s0.
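
To confirm which parameters actually reached the running kernel, /proc/cmdline can be inspected. Below is a minimal Python sketch, assuming a Linux host. The helper name parse_cmdline is made up for illustration, and it does not handle quoted values that contain spaces. The iommu entry is checked as well because that parameter comes up later in this post.

```python
from pathlib import Path


def parse_cmdline(text):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' map to None. Only the first '=' splits, so
    'pci=realloc=off' becomes {'pci': 'realloc=off'}.
    Simplifications: quoted values with spaces are not handled, and a
    repeated parameter keeps only its last value.
    """
    params = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    if path.exists():
        params = parse_cmdline(path.read_text())
        for key in ("pci", "iommu"):
            print(key, "=", params.get(key, "<not set>"))
```

If pci shows as "<not set>" after a reboot, the parameter was probably only typed in at the boot menu for a single boot and never made persistent in the bootloader configuration.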

So it appeared that my HBA card was working properly: I was able to use the three 10TB drives attached to it without any apparent problem. Looking at dmesg, however, I noticed the following problem:

Code:
[    0.101877] pci 0000:25:00.0: BAR 9: no space for [mem size 0x00400000 64bit]
[    0.101879] pci 0000:25:00.0: BAR 9: failed to assign [mem size 0x00400000 64bit]
[    0.101880] pci 0000:25:00.0: BAR 7: no space for [mem size 0x00040000 64bit]
[    0.101881] pci 0000:25:00.0: BAR 7: failed to assign [mem size 0x00040000 64bit]


That device, 25:00.0, is the LSI HBA card. I've done some reading into PCI Express I/O virtualization and Base Address Registers, but most material on the subject is not very friendly to a novice like me and I am not sure how to proceed.
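
For readers trying to decode those messages, here is a rough Python sketch that pulls the device, BAR index, and size out of the "failed to assign" lines and converts the size to KiB. The BAR-index mapping in describe_bar is an assumption based on the kernel's usual resource numbering (0-5 standard BARs, 6 expansion ROM, 7-12 SR-IOV virtual-function BARs) and should be checked against the kernel source for the version in use. The function names are made up for illustration.

```python
import re

# Matches the "failed to assign" lines quoted in this post.
BAR_RE = re.compile(
    r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): failed to assign \[mem size (?P<size>0x[0-9a-f]+)"
)


def describe_bar(index):
    """Label a resource index. ASSUMPTION: 0-5 standard BARs,
    6 expansion ROM, 7-12 SR-IOV virtual-function (VF) BARs."""
    if index <= 5:
        return "standard BAR %d" % index
    if index == 6:
        return "expansion ROM"
    if index <= 12:
        return "SR-IOV VF BAR %d" % (index - 7)
    return "bridge window or other"


def failed_bars(log_text):
    """Return (device, bar_index, size_in_bytes) for each failed BAR."""
    return [
        (m.group("dev"), int(m.group("bar")), int(m.group("size"), 16))
        for m in BAR_RE.finditer(log_text)
    ]


SAMPLE = """\
[    0.101879] pci 0000:25:00.0: BAR 9: failed to assign [mem size 0x00400000 64bit]
[    0.101881] pci 0000:25:00.0: BAR 7: failed to assign [mem size 0x00040000 64bit]
"""

for dev, bar, size in failed_bars(SAMPLE):
    print(dev, describe_bar(bar), size // 1024, "KiB")
```

The two requests come to 4096 KiB and 256 KiB. If the numbering assumption holds on this kernel, both failures concern virtual-function BARs, which are only needed when SR-IOV is enabled. That would be consistent with the drives working normally, but it is not confirmed here.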

Any idea on what the problem is? It seems odd that these errors appear, but I haven't noticed any adverse effects when using the drives through the HBA card. Thank you.

Possibly Unrelated Information Below

I was consistently getting many errors similar to this (didn't save the exact message):

Code:
[    120.101877] AMD-Vi: Event logged [IO_PAGE_FAULT device= .....................]


I initially used the iommu=soft kernel parameter and that corrected the problem. However, I later changed the IOMMU option in the BIOS from auto to enabled and I haven't noticed any other problem.

I also had a stability problem under no load and the system was reporting soft/hard lockups. The PC wouldn't run longer than 24 hours without something like the following happening and forcing me to hard reset manually:

Code:
kernel: [37273.944074] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
kernel: [37273.944130] rcu:      1-...!: (0 ticks this GP) idle=160/0/0x0 softirq=58051/58051 fqs=0
kernel: [37273.944183] rcu:      3-...!: (0 ticks this GP) idle=600/0/0x0 softirq=84258/84257 fqs=0
kernel: [37273.944235] rcu:      4-...!: (0 ticks this GP) idle=c94/0/0x0 softirq=88792/88792 fqs=0
kernel: [37273.944287] rcu:      5-...!: (0 ticks this GP) idle=498/0/0x0 softirq=130223/130223 fqs=0
kernel: [37273.944339] rcu:      9-...!: (1 ticks this GP) idle=394/0/0x0 softirq=76040/76040 fqs=0
kernel: [37273.944389] rcu:      10-...!: (0 ticks this GP) idle=f94/0/0x0 softirq=98626/98625 fqs=0
kernel: [37273.944440] rcu:      11-...!: (8 GPs behind) idle=628/0/0x0 softirq=87898/87898 fqs=0
kernel: [37273.944488] rcu:      (detected by 7, t=5254 jiffies, g=468889, q=20)
kernel: [37273.944532] Sending NMI from CPU 7 to CPUs 1:
kernel: [37273.944550] NMI backtrace for cpu 1 skipped: idling at acpi_idle_do_entry+0x15/0x30
kernel: [37273.945538] Sending NMI from CPU 7 to CPUs 3:
kernel: [37283.871357] Sending NMI from CPU 7 to CPUs 4:
kernel: [37293.797072] Sending NMI from CPU 7 to CPUs 5:
kernel: [37303.722787] Sending NMI from CPU 7 to CPUs 9:
kernel: [37313.648499] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1


I changed the Power Supply Idle Control in the BIOS to Common Current Idle and system stability seems to be fine (well, at least it has an uptime greater than 24 hours). However, I still get the following messages in dmesg:

Code:
[    0.536870] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.536941] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.537032] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.537109] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.537189] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.537272] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.537326] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.537374] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.537447] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.537518] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.537592] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[    0.537667] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)

Re: PCI Resource Allocation Problem

Postby Head_on_a_Stick » 2019-08-25 17:03

Firmware & µcode?

Re: PCI Resource Allocation Problem

Postby CwF » 2019-08-25 18:08

Is that an x8 pci-e lane card? I ran across an LSI card a few years back that worked, then gave a similar error when another card was plugged in and stole 4 lanes. Reordering solved it, and I confirmed it would not work in a 4x8 slot. If it is x8, check that the board is x8+. It usually doesn't, or shouldn't, matter. Note that when it was in the error state and passed via vfio to a VM, it did initialize correctly and could even access its BIOS within the VM, and not the host!

Re: PCI Resource Allocation Problem

Postby metal83 » 2019-09-01 04:14

Head_on_a_Stick wrote: Firmware & µcode?


Firmware: 1.90
microcode: 0x8001138

Re: PCI Resource Allocation Problem

Postby metal83 » 2019-09-01 04:21

CwF wrote: Is that a x8 pci-e lane card?


Thanks for taking the time to reply.

It is an x8 pci-e lane card. It is currently in an x16 physical slot that is wired at x4.

I thought it wouldn't have a problem doing its auto-negotiation thing according to the pci-e standards and running at x4 speeds.

Tomorrow I'm going to do some card rearranging and see if putting it in a properly rated slot will correct the issue.
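
Before and after moving the card, the negotiated link can be compared with what the card supports by reading the PCI device's sysfs attributes. This is a small Python sketch, assuming a Linux host and the device address 0000:25:00.0 from the dmesg output earlier in this thread. The address will likely change once cards are rearranged, and link_status is a made-up helper name.

```python
from pathlib import Path


def link_status(device_dir):
    """Read negotiated and maximum PCIe link width/speed from sysfs.

    Returns a dict; attributes that are absent map to None.
    """
    names = (
        "current_link_width",
        "max_link_width",
        "current_link_speed",
        "max_link_speed",
    )
    status = {}
    for name in names:
        attr = Path(device_dir) / name
        status[name] = attr.read_text().strip() if attr.is_file() else None
    return status


if __name__ == "__main__":
    # ASSUMPTION: address taken from the dmesg lines in this thread.
    device = "/sys/bus/pci/devices/0000:25:00.0"
    for key, value in link_status(device).items():
        print(key, value)
```

A current_link_width of 4 against a max_link_width of 8 would show that the card did negotiate down to x4 in the current slot.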