[Solved] AMDGPU and Kernel 5.10.0-19+ Issue (Working with 5.10.0-18)

idocgreen
Posts: 4
Joined: 2023-03-15 22:28

[Solved] AMDGPU and Kernel 5.10.0-19+ Issue (Working with 5.10.0-18)

#1 Post by idocgreen »

Since the update to kernel 5.10.0-19 last year, my computer has refused to boot into the GUI.
The problem I have found is that the amdgpu module is simply not loading.
However, my GPU is detected if I go into Advanced options and boot the old version of the kernel: 5.10.0-18.

I have seen this bug, https://bugs.debian.org/cgi-bin/bugrepo ... ug=1022025, which is similar to my issue; however I have an Intel CPU, and the patches proposed do not work.

I have tested up to 5.10.0-21 and it still does not work.

I have inspected kernel.log and found that neither the 'drm' module nor the 'amdgpu' module is loading.
I'm attaching a complete diff (see the attachments section); perhaps this can help.

Below is a short version of the diff.

What would be the cause of the issue?
Any ideas?

kernel.log: Booting 5.10.0-18 - All normal - working


[    0.000000] Linux version 5.10.0-18-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.140-1 (2022-09-02)
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.10.0-18-amd64 root=UUID=395b803e-2132-4576-9c8d-de572c6b74c5 ro quiet
[    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
.....
[    4.131268] sr 0:0:0:0: [sr0] scsi3-mmc drive: 24x/24x writer dvd-ram cd/rw xa/form2 cdda tray
[    4.131274] cdrom: Uniform CD-ROM driver Revision: 3.20
[    4.183746] usb 3-1.1: new full-speed USB device number 3 using ehci-pci
[    4.254689] [drm] amdgpu kernel modesetting enabled.
[    4.254913] CRAT table not found
[    4.254917] Virtual CRAT table created for CPU
[    4.254953] amdgpu: Topology: Add CPU node
[    4.255089] checking generic (e0000000 1f0000) vs hw (e0000000 10000000)
[    4.255091] fb0: switching to amdgpudrmfb from EFI VGA
[    4.255184] Console: switching to colour dummy device 80x25
[    4.255238] amdgpu 0000:03:00.0: vgaarb: deactivate vga console
[    4.255383] [drm] initializing kernel modesetting (POLARIS10 0x1002:0x67DF 0x1043:0x04C2 0xEF).
[    4.255386] amdgpu 0000:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[    4.255397] [drm] register mmio base: 0xF7E00000
[    4.255398] [drm] register mmio size: 262144
[    4.255402] [drm] PCIE atomic ops is not supported
[    4.255407] [drm] add ip block number 0 <vi_common>
[    4.255409] [drm] add ip block number 1 <gmc_v8_0>
[    4.255410] [drm] add ip block number 2 <tonga_ih>
[    4.255411] [drm] add ip block number 3 <gfx_v8_0>
[    4.255412] [drm] add ip block number 4 <sdma_v3_0>
[    4.255413] [drm] add ip block number 5 <powerplay>
[    4.255415] [drm] add ip block number 6 <dm>
[    4.255416] [drm] add ip block number 7 <uvd_v6_0>
[    4.255417] [drm] add ip block number 8 <vce_v3_0>
[    4.255420] kfd kfd: skipped device 1002:67df, PCI rejects atomics
[    4.255687] amdgpu 0000:03:00.0: amdgpu: Fetched VBIOS from ROM BAR
[    4.255690] amdgpu: ATOM BIOS: 115-C940PI0-100
[    4.255710] [drm] UVD is enabled in VM mode
[    4.255711] [drm] UVD ENC is enabled in VM mode
[    4.255713] [drm] VCE enabled in VM mode
[    4.255754] [drm] vm size is 256 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[    4.255815] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_mc.bin
[    4.255823] amdgpu 0000:03:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
[    4.255826] amdgpu 0000:03:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[    4.255833] [drm] Detected VRAM RAM=4096M, BAR=256M
[    4.255834] [drm] RAM width 256bits GDDR5
[    4.255995] [TTM] Zone  kernel: Available graphics memory: 41213726 KiB
[    4.255997] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
[    4.255999] [TTM] Initializing pool allocator
[    4.256004] [TTM] Initializing DMA pool allocator
[    4.256470] [drm] amdgpu: 4096M of VRAM memory ready
[    4.256474] [drm] amdgpu: 4096M of GTT memory ready.
[    4.256477] [drm] GART: num cpu pages 65536, num gpu pages 65536
[    4.257340] [drm] PCIE GART of 256M enabled (table at 0x000000F4001D5000).
[    4.257501] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_pfp_2.bin
[    4.257527] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_me_2.bin
[    4.257557] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_ce_2.bin
[    4.257560] [drm] Chained IB support enabled!
[    4.257586] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_rlc.bin
[    4.257700] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_mec_2.bin
[    4.257815] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_mec2_2.bin
[    4.259515] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_sdma.bin
[    4.259544] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_sdma1.bin
[    4.259697] amdgpu: hwmgr_sw_init smu backed is polaris10_smu
[    4.259920] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_uvd.bin
[    4.259925] [drm] Found UVD firmware Version: 1.130 Family ID: 16
[    4.260821] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_vce.bin
[    4.260826] [drm] Found VCE firmware Version: 53.26 Binary ID: 3
[    4.262239] amdgpu 0000:03:00.0: firmware: direct-loading firmware amdgpu/polaris10_k_smc.bin
[    4.298090] usb 3-1.1: New USB device found, idVendor=062a, idProduct=4102, bcdDevice= 1.03
[    4.298096] usb 3-1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[    4.298099] usb 3-1.1: Product: 2.4G Wireless Mouse
[    4.298101] usb 3-1.1: Manufacturer: MOSART Semi.

kernel.log: Booting 5.10.0-21 - AMDGPU not recognized - will not boot to GUI


[    0.000000] microcode: microcode updated early to revision 0x71a, date = 2020-03-24
[    0.000000] Linux version 5.10.0-21-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.162-1 (2023-01-21)
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.10.0-21-amd64 root=UUID=395b803e-2132-4576-9c8d-de572c6b74c5 ro quiet
....
--- > no mention of amdgpu/drm/etc.
[    4.224077] tsc: Refined TSC clocksource calibration: 1995.190 MHz
[    4.224092] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x3984dcddca4, max_idle_ns: 881590726873 ns
[    4.224200] clocksource: Switched to clocksource tsc
[    4.276161] sr 0:0:0:0: Attached scsi CD-ROM sr0
[    4.480075] usb 4-1.1: new full-speed USB device number 3 using ehci-pci
...
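
The comparison above can also be made without reading the whole log, by counting the lines that mention the driver. This is only a minimal sketch and not something posted in the thread: count_gpu_lines is a hypothetical helper name, and /var/log/kern.log is an assumed default location (pass another path if your kernel.log lives elsewhere).

```shell
#!/bin/sh
# Count the kernel log lines that mention amdgpu or carry the [drm] prefix.
# count_gpu_lines is a hypothetical helper name used for illustration.
count_gpu_lines() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the helper from reporting failure.
    grep -ciE 'amdgpu|\[drm\]' "$1" || true
}

# Assumed default log path; override with the first argument.
log="${1:-/var/log/kern.log}"
if [ -r "$log" ]; then
    echo "$log: $(count_gpu_lines "$log") matching lines"
fi
```

A count of 0 for a boot of the newer kernel, against a non-zero count for the older one, would match the symptom described here: the driver never started at all, rather than starting and then failing.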
Attachments
5.10.0-18 vs 5.10.0-21.html.gz
(41.83 KiB) Downloaded 12 times
Last edited by idocgreen on 2023-03-18 01:48, edited 1 time in total.

FreewheelinFrank
Global Moderator
Posts: 2082
Joined: 2010-06-07 16:59
Has thanked: 38 times
Been thanked: 225 times

Re: AMDGPU and Kernel 5.10.0-19+ Issue (Working with 5.10.0-18)

#2 Post by FreewheelinFrank »

You may need to file your own bug report for this against your specific hardware as fixes for previous issues have not worked for you.

idocgreen
Posts: 4
Joined: 2023-03-15 22:28

Re: AMDGPU and Kernel 5.10.0-19+ Issue (Working with 5.10.0-18)

#3 Post by idocgreen »

Indeed I was thinking about that as well. Will open a bug report.

Aki
Global Moderator
Posts: 2823
Joined: 2014-07-20 18:12
Location: Europe
Has thanked: 69 times
Been thanked: 385 times

Re: AMDGPU and Kernel 5.10.0-19+ Issue (Working with 5.10.0-18)

#4 Post by Aki »

idocgreen wrote: 2023-03-16 12:36 Indeed I was thinking about that as well. Will open a bug report.
Please post the bug report number (from the Debian Bug Tracking System) here. Let us know if you need help with the reporting process.
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄⠀

idocgreen
Posts: 4
Joined: 2023-03-15 22:28

Re: AMDGPU and Kernel 5.10.0-19+ Issue (Working with 5.10.0-18)

#5 Post by idocgreen »

I found the culprit: probably me (??)

I was in the process of reporting the bug, and while booted into the new kernel -21 (as suggested by reportbug), I decided to run
modprobe drm
and then
modprobe amdgpu
in a secondary shell.
The desktop loaded shortly after.
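
To confirm whether the two modules are actually loaded before and after running modprobe, something along these lines may help. It is a rough sketch under assumptions: module_loaded is a hypothetical helper name, and its optional second argument (an alternative file to read instead of /proc/modules) is there only to make the helper easy to try out.

```shell
#!/bin/sh
# Report whether a module appears in the kernel's list of loaded modules.
# module_loaded is a hypothetical helper name used for illustration.
module_loaded() {
    # /proc/modules has one loaded module per line, with the name first,
    # followed by a space. Anchoring on "name " avoids matching drm_kms_helper
    # when asking about drm.
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

for mod in drm amdgpu; do
    if module_loaded "$mod"; then
        echo "$mod: loaded"
    else
        echo "$mod: not loaded"
    fi
done
```

If both report "not loaded" right after boot and "loaded" after the manual modprobe calls, the driver itself works and the question becomes what is preventing it from being loaded automatically.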

So that's good news; however, I need to do this every time I reboot into the new kernel.

How would I then go about making sure these load automatically?
What makes me wonder is that they load automatically when I use the older version (-18).

So then I found out that the module was blacklisted: somehow I had a blacklist-amdgpu.conf in /etc/modprobe.d ... that was apparently ignored in -18 (??).
I am not sure how this file got there; I can't see anything in my bash history.
dpkg -S reports it's not from any package...
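
For anyone wanting to check for a stray blacklist entry like this one, a search of the modprobe configuration directories could look like the sketch below. It is illustrative only: find_blacklist is a hypothetical helper name, the default directory list is an assumption based on the usual modprobe.d locations, and it only looks for plain "blacklist <module>" lines.

```shell
#!/bin/sh
# Print every "blacklist <module>" line found in modprobe.d config files,
# prefixed with the file name and line number.
# find_blacklist is a hypothetical helper name used for illustration.
find_blacklist() {
    module="$1"
    shift
    # Assumed default search locations when no directories are given.
    [ "$#" -gt 0 ] || set -- /etc/modprobe.d /run/modprobe.d /lib/modprobe.d
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -H prints the file name, -n the line number, -E enables the
        # extended regular expression syntax used in the pattern.
        grep -HnE "^[[:space:]]*blacklist[[:space:]]+${module}([[:space:]]|\$)" \
            "$dir"/*.conf 2>/dev/null
    done
    return 0
}

find_blacklist amdgpu
```

Any file it reports can then be checked with dpkg -S, as above, to see whether a package owns it.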

Perhaps an operation I did a few months ago? I do not recall!

Anyhow, problem solved.
Thanks for lending an ear.
Last edited by idocgreen on 2023-03-18 01:13, edited 1 time in total.

idocgreen
Posts: 4
Joined: 2023-03-15 22:28

Re: AMDGPU and Kernel 5.10.0-19+ Issue (Working with 5.10.0-18)

#6 Post by idocgreen »

Actually, a slight follow-up in case someone else has this problem:
It's probably this bug that caused the issue: https://gitlab.freedesktop.org/drm/amd/-/issues/1918
It must have shown up when I tried installing the 'pro' version of the driver.
As of now (2023-03-17), the bug is still open.

Aki
Global Moderator
Posts: 2823
Joined: 2014-07-20 18:12
Location: Europe
Has thanked: 69 times
Been thanked: 385 times

Re: [Solved] AMDGPU and Kernel 5.10.0-19+ Issue (Working with 5.10.0-18)

#7 Post by Aki »

Hello,
Thanks for updating the thread and pointing to the solution (kernel module disabled in /etc/modprobe.d/blacklist-amdgpu.conf).
