QEMU with PCI Passthrough

Post by mike.3 » 2019-11-19 09:22

Howdy,
I am struggling to get PCI passthrough working. My GPU is:
Code:
08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Redwood PRO [Radeon HD 5550/5570/5630/6510/6610/7570]
08:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Redwood HDMI Audio [Radeon HD 5000 Series]

As far as I know, the graphics card does not support UEFI, so I am using the BIOS option.
I have the GPU isolated in a single IOMMU group:
Code:
IOMMU Group 18:
   08:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Redwood PRO [Radeon HD 5550/5570/5630/6510/6610/7570] [1002:68d9]
   08:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Redwood HDMI Audio [Radeon HD 5000 Series] [1002:aa60]
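
For anyone who wants to check the grouping from a script rather than by eye, here is a minimal Python sketch. It assumes the usual sysfs layout (/sys/kernel/iommu_groups/<group>/devices/<address>); the function names iommu_groups and is_isolated are made up for illustration.

```python
import os

def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted list of PCI addresses} read from sysfs."""
    groups = {}
    if not os.path.isdir(root):
        # IOMMU disabled or not a Linux host: nothing to report.
        return groups
    for name in os.listdir(root):
        devdir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devdir):
            groups[int(name)] = sorted(os.listdir(devdir))
    return groups

def is_isolated(groups, group, wanted):
    """True if the group contains exactly the devices we intend to pass."""
    return set(groups.get(group, [])) == set(wanted)

if __name__ == "__main__":
    found = iommu_groups()
    for number in sorted(found):
        print("IOMMU Group %d: %s" % (number, " ".join(found[number])))
```

With the listing above, is_isolated(groups, 18, ["0000:08:00.0", "0000:08:00.1"]) should come back True, since the group holds only the GPU and its HDMI audio function.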

The vfio-pci driver binds properly:
Code:
08:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Redwood PRO [Radeon HD 5550/5570/5630/6510/6610/7570] [1002:68d9]
        Subsystem: ASUSTeK Computer Inc. Redwood PRO [Radeon HD 5550/5570/5630/6510/6610/7570] [1043:036c]
        Kernel driver in use: vfio-pci
        Kernel modules: radeon
08:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Redwood HDMI Audio [Radeon HD 5000 Series] [1002:aa60]
        Subsystem: ASUSTeK Computer Inc. Redwood HDMI Audio [Radeon HD 5000 Series] [1043:aa60]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
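
The same check can be scripted. Below is a minimal Python sketch that reads text in the format shown above (as printed by lspci -nnk) and reports which driver each function is bound to. The function name drivers_in_use and the shortened SAMPLE text are illustrative only.

```python
import re

# Header lines start with a PCI address, optionally prefixed by a domain.
ADDR = re.compile(r"^((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s")

def drivers_in_use(lspci_text):
    """Map each PCI address to its 'Kernel driver in use' value (None if unbound)."""
    drivers = {}
    current = None
    for line in lspci_text.splitlines():
        match = ADDR.match(line)
        if match:
            current = match.group(1)
            drivers[current] = None
        elif current and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers

SAMPLE = """\
08:00.0 VGA compatible controller [0300]: Redwood PRO [1002:68d9]
        Kernel driver in use: vfio-pci
        Kernel modules: radeon
08:00.1 Audio device [0403]: Redwood HDMI Audio [1002:aa60]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
"""

if __name__ == "__main__":
    for address, driver in drivers_in_use(SAMPLE).items():
        print(address, driver)
```

Both functions should report vfio-pci; "Kernel modules" only lists what could bind, not what did.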

From the kernel log:
Code:
Nov 17 17:51:45 crystal kernel: [    1.085985] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Nov 17 17:51:45 crystal kernel: [    1.103149] vfio_pci: add [1002:68d9[ffffffff:ffffffff]] class 0x000000/00000000
Nov 17 17:51:45 crystal kernel: [    1.123204] vfio_pci: add [1002:aa60[ffffffff:ffffffff]] class 0x000000/00000000
Nov 17 17:51:45 crystal kernel: [    1.127650] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none

But when I start the machine via virt-manager, the Windows guest lists the graphics card in Device Manager but reports an error code and indicates that there are no resources for the device.

And here are my main questions:
- Should I run QEMU as a regular user or as root?
On the one hand, the documentation says it should not be run as root. But then problems arise with permissions on the devices:
Code:
2019-11-19T09:58:33.486041Z qemu-system-x86_64: -drive file=/dev/mapper/sda4_crypt,format=raw,if=none,id=drive-sata0-0-1,cache=none,aio=native: Could not open '/dev/mapper/sda4_crypt': Permission denied

or
Code:
root@crystal:~# ls -la /dev/vfio/
total 0
drwxr-xr-x  2 root root       80 Nov 19 03:10 .
drwxr-xr-x 19 root root     3520 Nov 19 10:57 ..
crw-------  1 root root 245,   0 Nov 19 03:11 18
crw-rw-rw-  1 root root  10, 196 Nov 19 03:11 vfio
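
To make the listing concrete: the group node 18 is mode crw------- and owned by root, so only root can open it, while the vfio control node is open to everyone. A rough Python sketch of that rule follows. It is a simplified model that looks only at the classic owner/group/other bits and ignores ACLs and capabilities; the function name can_open_rw is hypothetical.

```python
import os

def can_open_rw(path, uid, gids):
    """Simplified check: would this uid (with these gids) get read+write access?"""
    st = os.stat(path)
    if uid == 0:
        # root bypasses the permission bits.
        return True
    if st.st_uid == uid:
        bits = st.st_mode >> 6      # owner bits
    elif st.st_gid in gids:
        bits = st.st_mode >> 3      # group bits
    else:
        bits = st.st_mode           # other bits
    return (bits & 0o6) == 0o6      # need both read (4) and write (2)

if __name__ == "__main__":
    node = "/dev/vfio/vfio"
    if os.path.exists(node):
        print(node, can_open_rw(node, os.getuid(), os.getgroups()))
```

Called on the group node for a non-root user, this would return False with the permissions shown above.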

I think I could create udev rules for the devices, but I am not sure whether that is the right approach. As a temporary measure I chowned the device nodes (the mapped drive and vfio) to the user, which leads to another problem, this time with memory:
Code:
2019-11-19T10:04:31.640599Z qemu-system-x86_64: -device vfio-pci,host=08:00.0,id=hostdev0,bus=pci.3,addr=0x0: VFIO_MAP_DMA: -12
2019-11-19T10:04:31.640620Z qemu-system-x86_64: -device vfio-pci,host=08:00.0,id=hostdev0,bus=pci.3,addr=0x0: vfio_dma_map(0x5596d6150a00, 0x0, 0x7d000000, 0x7ff862e00000) = -12 (Cannot allocate memory)
2019-11-19T10:04:31.640820Z qemu-system-x86_64: -device vfio-pci,host=08:00.0,id=hostdev0,bus=pci.3,addr=0x0: vfio 0000:08:00.0: failed to setup container for group 18: memory listener initialization failed for container: Cannot allocate memory
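
For reference, -12 in that log is ENOMEM, and the third argument to vfio_dma_map (0x7d000000) is the size of the region being mapped. One common cause of this error when QEMU runs as an unprivileged user is the locked-memory limit (RLIMIT_MEMLOCK), because VFIO has to pin guest memory. That is an assumption, nothing in the log confirms it. A minimal Python sketch that decodes the size and compares it with the current limit (the function names are made up for illustration):

```python
import errno
import resource

def mapping_size_mib(size_hex):
    """Convert the hex size from the vfio_dma_map log line to MiB."""
    return int(size_hex, 16) / (1024 * 1024)

def memlock_allows(size_bytes, limit=None):
    """True if the soft RLIMIT_MEMLOCK would permit pinning size_bytes."""
    if limit is None:
        limit = resource.getrlimit(resource.RLIMIT_MEMLOCK)[0]
    return limit == resource.RLIM_INFINITY or limit >= size_bytes

if __name__ == "__main__":
    size = int("0x7d000000", 16)
    print("errno 12 is", errno.errorcode[12])
    print("mapping size: %.0f MiB" % mapping_size_mib("0x7d000000"))
    print("current memlock limit allows it:", memlock_allows(size))
```

The mapping here works out to 2000 MiB, so a small per-user limit would refuse it.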

Regarding the above, I think it may explain what I see in the guest VM: there are no resources for the isolated GPU I am trying to pass through.
Further, some howtos say that hugepages should be set up. I found that hugepages are not enabled in Debian by default (https://wiki.debian.org/Hugepages), so I started tweaking the system, and so far I have:
Code:
root@crystal:~# cat /proc/meminfo |grep Huge
AnonHugePages:    307200 kB
ShmemHugePages:        0 kB
HugePages_Total:    4300
HugePages_Free:     4300
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         8806400 kB
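
The numbers are consistent: 4300 pages of 2048 kB each is 8806400 kB, which matches the Hugetlb line. A small Python sketch of the same arithmetic, plus how many pages a guest of a given size would need. The 8192 MiB guest size is a hypothetical figure, and the function names are illustrative.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {key: integer value}."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info

def pages_needed(guest_mib, page_kib):
    """Number of hugepages needed to back guest_mib of RAM (rounded up)."""
    return -(-guest_mib * 1024 // page_kib)

MEMINFO = """\
HugePages_Total:    4300
HugePages_Free:     4300
Hugepagesize:       2048 kB
Hugetlb:         8806400 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(MEMINFO)
    reserved_kib = info["HugePages_Total"] * info["Hugepagesize"]
    print("reserved: %d kB" % reserved_kib)
    print("pages for a hypothetical 8192 MiB guest:",
          pages_needed(8192, info["Hugepagesize"]))
```

Note that the reserved pool is taken out of the 16 GB regardless of whether a guest is running.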

I have 16 GB of RAM in total. I also tried an NVIDIA graphics card, with the same result.
I have been stuck on this for the last 5 days, so any help would be appreciated.

Re: QEMU with PCI Passthrough

Post by CwF » 2019-11-20 15:02

mike.3 wrote: the virtual Windows has the graphic card listed in the device manager, but throws an error code and indicates there are no resources for the device

This is where you start. Which Windows version, and which error code?

Generally, the less the host knows about the 'thing' in the slot, the better. Any effort to manipulate enumeration on the host, permissions on the host, etc., causes problems that are out of scope here. If the device appears in the VM, the host issues are over.

With that said, not all configurations enumerate the device in the guest at the proper address. This is likely your case. For example, on one machine of mine, an AMD card works in one slot and not in another for Windows, and works in both slots for a Debian VM. The solution is to assign the device's address in the VM's XML definition. That's not a portable solution, so I don't do it. By portable, I mean the ability to move the hardware, image, and XML to another computer and have it work without modification. Yes, that works, and when it doesn't I change the hardware!

Do you have other VMs working? A Debian VM working with the passed-through hardware would be the test; there you need an Xorg hint file to specify the BusID of the GPU. If that works, and the Windows error code is 10 or 20, you need to define the address in the VM's XML, which I can't help with, as mentioned. If Windows shows error 43, as with an NVIDIA card (not the same story), there are fixes in the XML too. I avoid both hacks. The solution is a different slot, or a better motherboard.
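
One detail that trips people up with the Xorg hint file: lspci prints the address in hex, while the BusID option in xorg.conf takes the form PCI:bus:device:function with decimal numbers. A minimal Python sketch of the conversion follows. The function name xorg_busid is made up, and it assumes PCI domain 0, so any domain prefix is simply dropped.

```python
def xorg_busid(lspci_addr):
    """Convert 'bb:dd.f' or 'dddd:bb:dd.f' (hex) to 'PCI:b:d:f' (decimal)."""
    parts = lspci_addr.split(":")
    bus_hex, devfn = parts[-2], parts[-1]   # ignore an optional domain prefix
    dev_hex, fn_hex = devfn.split(".")
    return "PCI:%d:%d:%d" % (int(bus_hex, 16), int(dev_hex, 16), int(fn_hex, 16))

if __name__ == "__main__":
    # Inside the guest the address will usually differ from the host's 08:00.0,
    # so run lspci in the guest and convert whatever it reports there.
    print(xorg_busid("08:00.0"))
```

For low addresses hex and decimal look identical, which hides the problem until a card lands at something like 0a:00.0 (PCI:10:0:0).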

Resist all temptation to assume the host has something to do with the device's drivers; it doesn't. Anything you do to 'adjust' the host simply obscures the real problem. The host sees a 'thing' with a PCI ID, vfio grabs that ID and traces it back to the root port. You've got that, so you're done.

Hugepages are a tangent, and not a universal need. They do commit more memory on the host and complicate memory ballooning.

Your user needs to be in the right groups and enabled with sudo or similar. If your virt-manager is working, you're there. Once passed through, the device is not subject to permissions on the host, so drop that line of thinking. You may have created a tangle there that I can't address.

Generally, in my hundred or so examples, NVIDIA cards rarely work (intentional code 43) and you need to hide the fact that it is a VM in the XML. Quadros almost always work with the pro drivers. AMD cards seem to have issues with the address, so code 10 or 20 means try another slot or define the address in the XML. AMD pro cards (FirePro) seem to act like Quadros and don't care what address the card is at.

I can't stress enough: test the config with a Debian guest. Just because Windows doesn't work doesn't mean it's a host issue. Your device shows a clean IOMMU group and is properly under vfio driver control, so it should be good; verify with a working Linux OS image.