unstable freeze at boot w/ LVM on 5.18.0-3 & 5.18.0-4

Mashpot
Posts: 3
Joined: 2022-08-12 02:14

#1 Post by Mashpot »

Hello,

I am using Debian Sid and recently updated to kernel 5.18.0-3-amd64, which I cannot boot into. The previous version, 5.18.0-2, works fine (although I believe DirectX doesn't work now), and today I updated to 5.18.0-4, which has the same issue. The problem is that I have LVM encryption set up, so no logs are written about what is going wrong: the boot sequence seemingly freezes before I am prompted to enter my LVM passphrase. I will provide an image of where it freezes below.

I tried looking for anyone else who may be having the same issue but found nothing. I did however stumble upon bootparam (boot time parameters of the Linux kernel), where the boot argument 'debug' stood out. The man page says:
"Kernel messages are handed off to a daemon (e.g., klogd(8) or similar) so that they may be logged to disk."
But being mindful that my drive is encrypted, I didn't want to push ahead. If anyone could suggest how I should go about locating this issue so I can solve it or submit a bug report, it would be highly appreciated.
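
A minimal sketch of what trying 'debug' could look like, assuming GRUB is the boot loader: pressing e on a menu entry lets you edit the line starting with "linux" for that one boot only, and with 'debug' the kernel prints its messages to the console, so the parameter by itself does not write to the encrypted volume. The helper name verbose_cmdline and the root= value in the usage comment are illustrative, not taken from this system's configuration.

```shell
#!/bin/sh
# Hypothetical helper: given a kernel command line, print the same line
# with "quiet" removed and "debug" appended (the edit you would make by
# hand on the "linux" line in the GRUB editor).
verbose_cmdline() {
    out=""
    for word in $1; do
        case $word in
            quiet|debug) ;;              # drop quiet; avoid a duplicate debug
            *) out="$out $word" ;;
        esac
    done
    echo "${out# } debug"
}

# Usage (illustrative input):
#   verbose_cmdline "root=/dev/mapper/gnobian--vg-root ro quiet"
#   verbose_cmdline "$(cat /proc/cmdline)"
```

The output is only a preview of the edited line; nothing on the system is changed.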

Some info that may be useful:

The drive with LVM set up on it is a Samsung NVMe SSD 980 1TB, although I only partitioned one 256GB block at installation time for /, besides /boot and swap. (I haven't bothered increasing the size; it helps me not hoard data. When I do, I will move /home to a bigger partition.)

Code: Select all

$ lsblk -f
...
nvme0n1                                                                                   
├─nvme0n1p1 vfat     FAT32          53A4-A15D                                 505M     1% /boot/efi
├─nvme0n1p2 ext2     1.0            37ec93c9-cadb-4fe8-b9d5-65562ba7c2ab    213.6M    49% /boot
└─nvme0n1p3 crypto_L 2              f794dc9f-4cec-4c7b-a2ae-0d6ee7422fd9                  
  └─nvme0n1p3_crypt
            LVM2_mem LVM2 0         MgkABR-4ylr-jB2S-QC1d-e9ee-ZPUM-U39SCS                
    ├─gnobian--vg-root
    │       ext4     1.0            f6d20f9b-7e95-416a-92d2-ad8ef331ae3c     14.2G    88% /
    └─gnobian--vg-swap_1
            swap     1              78995ca5-eed3-461b-8342-0bc1acb5398e                  [SWAP]

Code: Select all

$ lspci
00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
00:02.0 Display controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
00:14.0 USB controller: Intel Corporation 9 Series Chipset Family USB xHCI Controller
00:16.0 Communication controller: Intel Corporation 9 Series Chipset Family ME Interface #1
00:1a.0 USB controller: Intel Corporation 9 Series Chipset Family USB EHCI Controller #2
00:1c.0 PCI bridge: Intel Corporation 9 Series Chipset Family PCI Express Root Port 1 (rev d0)
00:1c.2 PCI bridge: Intel Corporation 9 Series Chipset Family PCI Express Root Port 3 (rev d0)
00:1c.3 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d0)
00:1c.4 PCI bridge: Intel Corporation 9 Series Chipset Family PCI Express Root Port 5 (rev d0)
00:1d.0 USB controller: Intel Corporation 9 Series Chipset Family USB EHCI Controller #1
00:1f.0 ISA bridge: Intel Corporation Z97 Chipset LPC Controller
00:1f.2 SATA controller: Intel Corporation 9 Series Chipset Family SATA Controller [AHCI Mode]
00:1f.3 SMBus: Intel Corporation 9 Series Chipset Family SMBus Controller
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller 980
03:00.0 Ethernet controller: Qualcomm Atheros Killer E220x Gigabit Ethernet Controller (rev 13)
04:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 03)

The attached picture is of 5.18.0-3 in recovery mode (selected in GRUB). 5.18.0-4 stops at the same log line, although it prints different messages before that, none of which catch my eye. I will post the 5.18.0-4 logs once I'm free to reboot. I should also mention that without booting in recovery mode, nothing except a usual warning shows.
Edit: Disregard my previous statement. After booting into 5.18.0-4 in recovery mode and comparing side by side, both versions print exactly the same.
Attachment: bootissue5.18.0-3.jpeg (5.18.0-3, 477.83 KiB)

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

#2 Post by p.H »

Last line is from i915. Try booting with "nomodeset" to disable it.

#3 Post by Mashpot »

p.H wrote: 2022-08-12 17:18 Last line is from i915. Try booting with "nomodeset" to disable it.
This worked and got it to boot. There still appear to be further issues, but I'm on the right track.
Thanks :)

#4 Post by p.H »

I doubt the issue has anything to do with LVM, so you may consider retitling the thread.

lspci shows that the machine has two GPUs, integrated Intel and Nvidia. Are they independent (each one has separate output ports) or linked together with Optimus technology or similar (both share the same output ports)?
If they are independent, which is the primary/used one? Is it the Nvidia GPU? Does it use the nouveau (free) or nvidia (non-free) driver?

"nomodeset" is usually not a solution, merely a diagnostic tool or crude workaround. Using it permanently may severely degrade graphics performance (resolution and speed). To selectively disable kernel modesetting with the Intel i915 driver, use i915.modeset=0.
The last kernel message mentions VT-d (I/O virtualization); you may also try disabling it in the BIOS/UEFI settings.
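
A minimal sketch of making such a parameter persistent, assuming a Debian system booted through GRUB with the stock /etc/default/grub layout (a GRUB_CMDLINE_LINUX_DEFAULT="..." line). The helper name add_kernel_param is hypothetical, and the usage works on a copy so that nothing changes until the diff has been reviewed.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel parameter to the
# GRUB_CMDLINE_LINUX_DEFAULT line of the given file, unless it is
# already present on that line.
add_kernel_param() {
    file=$1
    param=$2
    # Nothing to do if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*${param}" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote, writing to a
    # temporary file first so a failed sed cannot truncate the original.
    sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file" > "$file.new" &&
        mv "$file.new" "$file"
}

# Usage (review the diff before installing the result):
#   cp /etc/default/grub /tmp/grub.edit
#   add_kernel_param /tmp/grub.edit i915.modeset=0
#   diff /etc/default/grub /tmp/grub.edit
#   sudo cp /tmp/grub.edit /etc/default/grub && sudo update-grub
```

For a one-off test the same parameter can instead be typed onto the "linux" line in the GRUB editor, which affects only that boot.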

What is the output of

Code: Select all

lspci -nnkd ::300
lspci -nnkd ::380
when booting with an older kernel which does not have the issue and when booting with a newer kernel which has the issue?
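
One way to make that comparison, sketched under the assumption that a shell is reachable under both kernels (for example by booting the newer one with "nomodeset"): save the output to a file named after the running kernel, then diff the two files. The file names and the helper name compare_captures are hypothetical.

```shell
#!/bin/sh
# Capture step, run once under each kernel (file name is illustrative):
#   { lspci -nnkd ::300; lspci -nnkd ::380; } > "lspci-gpu-$(uname -r).txt"

# Hypothetical helper: show a unified diff of two saved captures, or say
# so explicitly when they are identical.
compare_captures() {
    if diff -u "$1" "$2"; then
        echo "no differences"
    fi
}

# Usage:
#   compare_captures lspci-gpu-5.18.0-2-amd64.txt lspci-gpu-5.18.0-4-amd64.txt
```

A changed "Kernel driver in use" line between the two captures would be the first thing to look for.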

#5 Post by Mashpot »

p.H wrote: 2022-08-14 08:38 I doubt the issue has anything to do with LVM, so you may consider retitling the thread.

Yeah, I had a feeling the issue was graphics-related from the way it froze. Without the encrypted drive it would have booted straight in. After playing around in the BIOS without success, I tried typing my passphrase at the frozen screen and it worked; I must have got it wrong when I tried before. I think that before the drive is unlocked the system tries to use i915, and once it is unlocked the nvidia-driver (non-free) loads.

I did a little reading on nomodeset and came across i915.modeset=0 too, but I got it to boot as described above before trying it. I also updated the i915 firmware and disabled UEFI, without success. I didn't think to disable VT-d as I use KVM, but I will try disabling it next. However, while considering your reply and reading the BIOS manual I was reminded of the Initiate Graphic Adapter setting (boot GPU device), which might be relevant, as I know it is set to PCIe rather than integrated.

I am currently running the new kernel, which is where the output below comes from. Earlier, under the old kernel, I ran lspci and the output looked the same; the class and drivers are the same at least.

Code: Select all

$ lspci -nnkd ::380
00:02.0 Display controller [0380]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller [8086:0412] (rev 06)
	DeviceName:  Onboard IGD
	Subsystem: Micro-Star International Co., Ltd. [MSI] Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller [1462:7918]
	Kernel driver in use: i915
	Kernel modules: i915
---
$ lspci -nnkd ::300
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
	Subsystem: ASUSTeK Computer Inc. GP104 [GeForce GTX 1080] [1043:8592]
	Kernel driver in use: nvidia
	Kernel modules: nvidia

In the BIOS the Intel audio controller is disabled, yet below the kernel tries enabling it: "snd_hda_intel ... enabling device (0000 -> 0002)". Do the numbers in brackets represent a device address? Compare "i915 ... enabling device (0000 -> 0003)": the two overlap, which would make sense as both devices are integrated in the CPU. I'll try enabling Intel audio.

Code: Select all

$ dmesg | egrep "i915|intel|drm" 
[    0.538774] intel_pstate: Intel P-state driver initializing
[    0.909190] ACPI: bus type drm_connector registered
[    1.049834] i915 0000:00:02.0: enabling device (0000 -> 0003)
[    1.050056] i915 0000:00:02.0: [drm] VT-d active for gfx access
[    1.050430] i915 0000:00:02.0: [drm] Transparent Hugepage mode 'huge=within_size'
[    1.050440] i915 0000:00:02.0: [drm] DMAR active, disabling use of stolen memory
[    1.057738] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[    1.073158] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
[    1.394448] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
[    1.400472] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
[   23.355521] systemd[1]: Starting Load Kernel Module drm...
[   23.362607] systemd[1]: modprobe@drm.service: Deactivated successfully.
[   23.362699] systemd[1]: Finished Load Kernel Module drm.
[   23.856249] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[   23.856251] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
[   23.870929] snd_hda_intel 0000:00:03.0: enabling device (0000 -> 0002)
[   23.871041] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[   23.872655] snd_hda_intel 0000:01:00.1: enabling device (0000 -> 0002)
[   23.872723] snd_hda_intel 0000:01:00.1: Disabling MSI
[   23.872731] snd_hda_intel 0000:01:00.1: Handle vga_switcheroo audio client
[   24.192791] intel_rapl_common: Found RAPL domain package
[   24.192794] intel_rapl_common: Found RAPL domain core
[   24.192794] intel_rapl_common: Found RAPL domain uncore
[   24.192795] intel_rapl_common: Found RAPL domain dram
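
On the "enabling device (0000 -> 0003)" lines above: as far as I know these numbers are not addresses. The kernel prints the device's PCI command register before and after enabling it, where bit 0 is I/O space decoding, bit 1 is memory space decoding and bit 2 is bus mastering. A small sketch that decodes such a value (the function name decode_pci_command is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: decode the low three bits of a PCI command
# register value given as hex digits, e.g. 0003.
decode_pci_command() {
    val=$((0x$1))
    out=""
    if [ $((val & 0x1)) -ne 0 ]; then out="$out io"; fi          # I/O space
    if [ $((val & 0x2)) -ne 0 ]; then out="$out mem"; fi         # memory space
    if [ $((val & 0x4)) -ne 0 ]; then out="$out busmaster"; fi   # bus master
    echo "${out# }"
}

decode_pci_command 0003   # prints: io mem
decode_pci_command 0002   # prints: mem
```

Read this way, i915 had both I/O and memory decoding switched on, while the audio functions only needed memory decoding, so the similar numbers do not mean the devices share anything.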

I could have over-analyzed this, but both the Nvidia and the Intel audio devices use the same Intel driver. Is that normal?

Code: Select all

$ lspci -nnkd ::0403
00:03.0 Audio device [0403]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller [8086:0c0c] (rev 06)
	Subsystem: Micro-Star International Co., Ltd. [MSI] Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller [1462:7918]
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
01:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
	Subsystem: ASUSTeK Computer Inc. GP104 High Definition Audio Controller [1043:8592]
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

I was looking at my BIOS version and it's quite old compared to the latest (about 3 years). Would you know if it's worth updating? One more thing: should I be replying with a quote, or does a normal comment notify the previous commenter?
Anyway, I'm off to reboot and try a few things.
