Grub error after upgrading VMware Compatibility from 15 to 19

rolfzi
Posts: 4
Joined: 2022-11-15 14:13
Has thanked: 1 time

Grub error after upgrading VMware Compatibility from 15 to 19

#1 Post by rolfzi »

Hi all,

we have a strange behaviour on some of our Debian 11.5 VMs on VMware ESX 7.0.
After upgrading the HW compatibility from V15 to V19, the VM no longer boots.
The boot stops at the grub prompt and shows no menu.
If I type "exit" at the grub prompt, the ESX boot manager starts. In this menu I can select "debian".
Now it displays the grub menu and I can boot normally.

We have this issue on about 5% of our Debian VMs; the other VMs boot normally.
All VMs are installed automatically from a VMware template created with Packer.


We use UEFI boot. The /boot directory is on the LVM root volume. /boot is not a separate filesystem.
If I add a sleep step to the file /boot/efi/EFI/debian/grub.cfg, I can see the error message:

error: no such device <uuid>.

I have already tried reinstalling the grub packages, running filesystem checks, and several options of the grub-install command.
None of these report an error, but the problem persists.

Here is some output of our configuration:

Code: Select all

root > fdisk -l /dev/sda
Disk /dev/sda: 56 GiB, 60129542144 bytes, 117440512 sectors
Disk model: Virtual disk
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 58F3DB57-438E-4FCF-A892-AD3ABD937DDF

Device       Start       End   Sectors  Size Type
/dev/sda1     2048   1050623   1048576  512M EFI System
/dev/sda2  1050624 117438463 116387840 55.5G Linux LVM
root >
root > df -h / /boot/efi
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LV_   24G  8.8G   14G  40% /
/dev/sda1                   511M  6.0M  506M   2% /boot/efi
root >
root > cat /boot/efi/EFI/debian/grub.cfg
search.fs_uuid bb4edbd7-8f64-4518-9d90-b8d82f9414bc root lvmid/FUkr0D-a6pq-Qgf7-cYQh-Megd-SGh3-W0HzSj/LXe7Fa-XcjH-EHeV-1Hbb-k0mR-ooW9-lYqLDq
set prefix=($root)'/boot/grub'
configfile $prefix/grub.cfg
root >
root > grub-install
Installing for x86_64-efi platform.
Installation finished. No error reported.
root >
root > efibootmgr
BootCurrent: 0006
BootOrder: 0006,0000,0001,0002,0005,0003
Boot0000* EFI Virtual disk (0.0)
Boot0001* EFI Virtual disk (1.0)
Boot0002* EFI Virtual disk (2.0)
Boot0003* EFI Network
Boot0005* EFI Internal Shell (Unsupported option)
Boot0006* debian
root >

Has anyone seen this problem before?
Any hints are welcome.

Kind regards

Rolf

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: Grub error after upgrading VMware Compatibility from 15 to 19

#2 Post by p.H »

This may happen when there are two GRUB instances and the first one points to a location which no longer exists.
To gather some information:
At the first grub> prompt, type "set" and "ls" and note the output.
At the GRUB menu, press "c" to start the GRUB shell and type again "set" and "ls" and note the output.
Compare with the previous output.
Values of most interest are $cmdpath, $prefix and $root.
Commands in the system (not GRUB shell) which may provide other useful information:

Code: Select all

blkid
efibootmgr -v
find /boot/efi
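
One more cross-check, as a minimal sketch rather than something posted in this thread: the UUID that the stub config on the ESP searches for can be compared with the UUID of the mounted root filesystem. The function name `stub_search_uuid` is made up for illustration; the path in the usage comment is the Debian location quoted in the first post.

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem UUID that the stub grub.cfg
# on the ESP passes to search.fs_uuid (the first argument of that command).
stub_search_uuid() {
    awk '$1 == "search.fs_uuid" { print $2; exit }' "$1"
}

# Example use on a booted system (run as root); findmnt is part of util-linux:
#   [ "$(stub_search_uuid /boot/efi/EFI/debian/grub.cfg)" = "$(findmnt -no UUID /)" ] \
#       && echo "stub config points at the mounted root filesystem"
```

If the two values match, the stub config is not stale, and a "no such device" error has to come from GRUB not seeing the device at boot time.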

rolfzi
Posts: 4
Joined: 2022-11-15 14:13
Has thanked: 1 time

Re: Grub error after upgrading VMware Compatibility from 15 to 19

#3 Post by rolfzi »

Comparing the output of the grub commands, I see that at the first prompt the variables root and prefix do not point to the LVM volume:

first grub prompt:

Code: Select all

prefix=(hd0,gpt1)/boot/grub
root=hd0,gpt1
2nd grub prompt:

Code: Select all

prefix=(lvmid/FUkr0D-a6pq-Qgf7-cYQh-Megd-SGh3-W0HzSj/LXe7Fa-XcjH-EHeV-1Hbb-k0mR-ooW9-lYqLDq)/boot/grub
pxe_default_server=
root=lvmid/FUkr0D-a6pq-Qgf7-cYQh-Megd-SGh3-W0HzSj/LXe7Fa-XcjH-EHeV-1Hbb-k0mR-ooW9-lYqLDq
Error message before first grub prompt:

Code: Select all

Welcome to GRUB!
error: no such device: bb4edbd7-8f64-4518-9d90-b8d82f9414bc.
Complete Output of executed commands:
first grub prompt

Code: Select all

grub> ls
(proc) (lvm/VolGroup00-LV_swap) (lvm/VolGroup00-LV_tmp) (lvm/VolGroup00-LV_var)
(lvm/VolGroup00-LV_) (hd0) (hd0,gpt2) (hd0,gpt1)
grub>

grub> set
?=0
check_signatures=no
cmdpath=(hd0,gpt1)/EFI/debian
color_highlight=black/light-gray
color_normal=light-gray/black
feature_200_final=y
feature_all_video_module=y
feature_chainloader_bpb=y
feature_default_font_path=y
feature_menuentry_id=y
feature_menuentry_options=y
feature_nativedisk_cmd=y
feature_ntldr=y
feature_platform_search_hint=y
feature_timeout_style=y
grub_cpu=x86_64
grub_platform=efi
lang=
locale_dir=
net_default_ip=(null)
net_default_mac=(null)
net_default_server=
pager=1
prefix=(hd0,gpt1)/boot/grub
pxe_default_server=
root=hd0,gpt1
secondary_locale_dir=
grub>
2nd grub prompt, after "exit" command:

Code: Select all

grub> ls
(proc) (lvm/VolGroup01-LV_home) (lvm/VolGroup01-LV_appl) (lvm/VolGroup00-LV_swap) (lvm/VolGroup00-LV_tmp)
(lvm/VolGroup00-LV_var) (lvm/VolGroup00-LV_) (hd0) (hd0,gpt2) (hd0,gpt1) (hd1) (hd2)
grub>

grub> set
?=0
check_signatures=no
cmdpath=(hd0,gpt1)/EFI/debian
color_highlight=black/light-gray
color_normal=light-gray/black
config_directory=(hd0,gpt1)/EFI/debian
config_file=(hd0,gpt1)/EFI/debian/grub.cfg
default=0
feature_200_final=y
feature_all_video_module=y
feature_chainloader_bpb=y
feature_default_font_path=y
feature_menuentry_id=y
feature_menuentry_options=y
feature_nativedisk_cmd=y
feature_ntldr=y
feature_platform_search_hint=y
feature_timeout_style=y
font=unicode
gfxmode=auto
grub_cpu=x86_64
grub_platform=efi
have_grubenv=true
lang=en_US
linux_gfx_mode=
locale_dir=(lvmid/FUkr0D-a6pq-Qgf7-cYQh-Megd-SGh3-W0HzSj/LXe7Fa-XcjH-EHeV-1Hbb-k0mR-ooW9-lYqLDq)/boot/grub/locale
menu_color_highlight=white/blue
menu_color_normal=cyan/blue
menuentry_id_option=--id
net_default_ip=(null)
net_default_mac=(null)
net_default_server=
pager=1
prefix=(lvmid/FUkr0D-a6pq-Qgf7-cYQh-Megd-SGh3-W0HzSj/LXe7Fa-XcjH-EHeV-1Hbb-k0mR-ooW9-lYqLDq)/boot/grub
pxe_default_server=
root=lvmid/FUkr0D-a6pq-Qgf7-cYQh-Megd-SGh3-W0HzSj/LXe7Fa-XcjH-EHeV-1Hbb-k0mR-ooW9-lYqLDq
secondary_locale_dir=
timeout_style=menu
grub>
commands executed from booted OS:

Code: Select all

~> sudo blkid
/dev/sda1: UUID="8D0B-731F" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="cf8b56e4-f494-414e-87b2-d4d677814d68"
/dev/sda2: UUID="e6FoCV-rLi2-C6GV-3zhi-1ojq-677B-xJMVk7" TYPE="LVM2_member" PARTUUID="264e10f6-05b4-4629-973a-b5d484b626ad"
/dev/sdb: UUID="j1rtDW-9X8U-uYyt-zUXA-Vy9y-JtE7-1vAAmY" TYPE="LVM2_member"
/dev/sdc: UUID="pvfkR2-8tIj-csWv-JN9C-qkr2-dJCq-JPDeId" TYPE="LVM2_member"
/dev/mapper/VolGroup01-LV_appl: UUID="a289afb3-e284-44ed-8f97-9858cadda22d" BLOCK_SIZE="4096" TYPE="ext4"
/dev/mapper/VolGroup01-LV_home: UUID="68b4b51c-47ae-4494-9ce9-71ab9d011a8d" BLOCK_SIZE="4096" TYPE="ext4"
/dev/mapper/VolGroup00-LV_: UUID="bb4edbd7-8f64-4518-9d90-b8d82f9414bc" BLOCK_SIZE="4096" TYPE="ext4"
/dev/mapper/VolGroup00-LV_var: UUID="0fc19149-2d4a-475b-90ff-90f1ba5cd2b1" BLOCK_SIZE="4096" TYPE="ext4"
/dev/mapper/VolGroup00-LV_tmp: UUID="976ce965-6ead-40ac-9932-de38b97ef845" BLOCK_SIZE="4096" TYPE="ext4"
/dev/mapper/VolGroup00-LV_swap: UUID="31ded422-e77f-44e5-880b-3047ec90ae6d" TYPE="swap"
~>

~> sudo efibootmgr -v
BootCurrent: 0006
BootOrder: 0006,0000,0001,0002,0005,0003
Boot0000* EFI Virtual disk (0.0)        PciRoot(0x0)/Pci(0x15,0x0)/Pci(0x0,0x0)/SCSI(0,0)
Boot0001* EFI Virtual disk (1.0)        PciRoot(0x0)/Pci(0x15,0x0)/Pci(0x0,0x0)/SCSI(1,0)
Boot0002* EFI Virtual disk (2.0)        PciRoot(0x0)/Pci(0x15,0x0)/Pci(0x0,0x0)/SCSI(2,0)
Boot0003* EFI Network   PciRoot(0x0)/Pci(0x16,0x0)/Pci(0x0,0x0)/MAC(005056938149,1)
Boot0005* EFI Internal Shell (Unsupported option)       MemoryMapped(11,0xefe6018,0xf3f5017)/FvFile(c57ad6b7-0515-40a8-9d21-551652854e37)
Boot0006* debian        HD(1,GPT,cf8b56e4-f494-414e-87b2-d4d677814d68,0x800,0x100000)/File(\EFI\debian\shimx64.efi)
 ~>

~> sudo find /boot/efi/ -ls
        1      4 drwx------   3 root     root         4096 Jan  1  1970 /boot/efi/
      122      4 drwx------   4 root     root         4096 Nov 15 15:00 /boot/efi/EFI
      125      4 drwx------   2 root     root         4096 Nov 15 15:28 /boot/efi/EFI/debian
      141    916 -rwx------   1 root     root       934240 Nov 15 15:44 /boot/efi/EFI/debian/shimx64.efi
      142   1648 -rwx------   1 root     root      1684928 Nov 15 15:44 /boot/efi/EFI/debian/grubx64.efi

 ~> sudo grub-install
Installing for x86_64-efi platform.
Installation finished. No error reported.
 ~>

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: Grub error after upgrading VMware Compatibility from 15 to 19

#4 Post by p.H »

The obvious difference is that drives hd1 and hd2 and LVM VolGroup01 volumes are missing. I guess that the missing LVs are on the missing drives. Smells like a UEFI firmware issue.

The error message means that GRUB could not find the root filesystem UUID, but IIUC the root filesystem is in (lvm/VolGroup00-LV_) and this volume is present. Can you check its metadata and contents from the GRUB prompt ?

Code: Select all

ls (lvm/VolGroup00-LV_)
ls (lvm/VolGroup00-LV_)/
Could it be that it has some physical extents on the missing drives ?
What is the layout of the LVs ? From bash:

Code: Select all

lsblk
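
To answer the extents question without reading the layout by eye, the per-segment device list from `lvs` can be filtered. This is a rough sketch under assumptions: the helper name `lvs_outside_pv` is invented for illustration, and /dev/sda2 is taken as the PV on the boot disk from the `fdisk` output in the first post.

```shell
#!/bin/sh
# Hypothetical helper: list LVs that have at least one segment on a PV
# other than the one given as $1.
# Reads "lv_name devices" rows on stdin, one row per segment, as printed by:
#   lvs --noheadings -o lv_name,devices VolGroup00
lvs_outside_pv() {
    boot_pv=$1
    awk -v pv="$boot_pv" '
        NF >= 2 {
            # A striped segment lists several devices separated by commas.
            n = split($2, devs, ",")
            for (i = 1; i <= n; i++) {
                dev = devs[i]
                sub(/\([0-9]+\)$/, "", dev)   # strip the "(start_extent)" suffix
                # Print each offending LV name only once.
                if (dev != pv && !seen[$1]++) print $1
            }
        }'
}

# Example use on a booted system (run as root):
#   lvs --noheadings -o lv_name,devices VolGroup00 | lvs_outside_pv /dev/sda2
```

Any LV printed here has extents that GRUB can only read if the firmware exposes the other disks too.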

rolfzi
Posts: 4
Joined: 2022-11-15 14:13
Has thanked: 1 time

Re: Grub error after upgrading VMware Compatibility from 15 to 19

#5 Post by rolfzi »

Thanks for your reply.
Yes, some of the physical extents were on hd1.
I moved all physical extents to disk hd0 with these commands:

Code: Select all

pvmove /dev/sda2 -n LV_var
pvmove /dev/sdb -n LV_
Now the system boots again :-)

I noticed that we have the same behaviour on our RHEL VMs:
RHEL VM with HW Version 15:

Code: Select all

Minimal BASH-like line editing is supported. For the first word, TAB lists possible command completions. Anywhere else TAB lists possible device or file completions. ESC at any time exits.
grub> ls
(proc) (hd0) (hd0,gpt2) (hd0,gpt1) (hd1) (hd2) (hd3) (cd0)
(lvm/vg_restore-lv_restore) (lvm/VolGroup01-LV_appl) (lvm/VolGroup01-LV_home) (lvm/VolGroup00-LV_) (lvm/VolGroup00-LV_var) (lvm/VolGroup00-LV_tmp) (lvm/VolGroup00-LV_swap)
grub>
The same RHEL VM with HW Version 19:

Code: Select all

Minimal BASH-like line editing is supported. For the first word, TAB lists possible command completions. Anywhere else TAB lists possible device or file completions. ESC at any time exits.
grub> ls
(proc) (hd0) (hd0,gpt2) (hd0,gpt1) (cd0)
grub>
On RHEL we have no boot issues, because /boot is a filesystem on disk hd0, outside of LVM.

I will open now a VMware Case for this.

Thank you very much for your support.
Kind regards
Rolf

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: Grub error after upgrading VMware Compatibility from 15 to 19

#6 Post by p.H »

Thank you for the feedback.
rolfzi wrote: 2022-11-17 14:37 On RHEL we have no boot issues, because /boot is a filesystem on disk hd0, outside of LVM.
GRUB's ls shows only two partitions. Is the EFI partition mounted on /boot instead of /boot/efi ?
rolfzi wrote: 2022-11-17 14:37 I will open now a VMware Case for this.
Can you update the topic with their answer ? (and mark the topic solved in the original post subject)

rolfzi
Posts: 4
Joined: 2022-11-15 14:13
Has thanked: 1 time

Re: Grub error after upgrading VMware Compatibility from 15 to 19

#7 Post by rolfzi »

p.H wrote: 2022-11-18 07:55 GRUB's ls shows only two partitions. Is the EFI partition mounted on /boot instead of /boot/efi ?

On RHEL we have the LVM volume group on the second disk.
These two partitions are /boot and /boot/efi:

Code: Select all

> sudo lsblk
NAME                    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                       8:0    0    3G  0 disk
├─sda1                    8:1    0  600M  0 part /boot/efi
└─sda2                    8:2    0    2G  0 part /boot
sdb                       8:16   0  192G  0 disk
├─VolGroup00-LV_        253:0    0   64G  0 lvm  /
├─VolGroup00-LV_swap    253:2    0    8G  0 lvm  [SWAP]
├─VolGroup00-LV_tmp     253:5    0   16G  0 lvm  /tmp
└─VolGroup00-LV_var     253:6    0   64G  0 lvm  /var
sdc                       8:32   0  128G  0 disk
├─VolGroup01-LV_home    253:3    0   64G  0 lvm  /home
└─VolGroup01-LV_appl    253:4    0   48G  0 lvm  /appl
sdd                       8:48   0  150G  0 disk
└─vg_restore-lv_restore 253:1    0  120G  0 lvm
sr0                      11:0    1 1024M  0 rom


p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: Grub error after upgrading VMware Compatibility from 15 to 19

#8 Post by p.H »

Thanks for the clarification. It is weird to have a whole 3GB drive for boot only, but it's virtual so why bother...

petrvandrovec2006
Posts: 1
Joined: 2023-01-05 10:58
Been thanked: 2 times

Re: Grub error after upgrading VMware Compatibility from 15 to 19

#9 Post by petrvandrovec2006 »

VMware's UEFI implements a quick boot feature, which is enabled by default from hardware version 16 onward. With quick boot, only the devices that are necessary for loading the boot loader are initialized. So if the boot loader needs to access other devices (e.g. other hard disks), it must itself connect controllers to all known UEFI handles. That triggers discovery of the devices that were not needed to load the boot loader, making the additional disks accessible to GRUB (https://edk2-docs.gitbook.io/edk-ii-uef ... controller).

The quick patch below abuses the lsefi command to trigger controller connection. With this patch, boot should work with all hardware versions if you add the command 'lsefi' to the GRUB script before the first reference to the additional hard drives.

--- grub2-2.06/grub-core/commands/efi/lsefi.c.orig 2023-01-05 02:33:44.273698142 -0800
+++ grub2-2.06/grub-core/commands/efi/lsefi.c 2023-01-05 02:35:57.158506248 -0800
@@ -107,6 +107,9 @@
           grub_efi_print_device_path (dp);
         }
 
+      status = efi_call_4 (grub_efi_system_table->boot_services->connect_controller,
+                           handle, NULL, NULL, 1);
+      /* Ignore errors, it may be no driver, or already started, or whatever. */
       status = efi_call_3 (grub_efi_system_table->boot_services->protocols_per_handle,
                            handle, &protocols, &num_protocols);
       if (status != GRUB_EFI_SUCCESS) {


Another option is to add 'efi.quickBoot.enabled = "FALSE"' to the *.vmx configuration file. That is equivalent to disabling the quick boot/fast boot option in various other UEFI implementations (see e.g. https://www.elevenforum.com/t/enable-or ... s-11.4922/). In that case UEFI will initialize all devices, even if they are not necessary for boot.
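
For anyone applying the .vmx setting to many VMs, here is a small sketch of how the line could be set idempotently with a script. Assumptions: the VM is powered off while its .vmx file is edited, the file uses the usual `key = "value"` line format, and the helper name `set_vmx_option` and the path in the usage comment are placeholders, not VMware tooling.

```shell
#!/bin/sh
# Hypothetical helper: set KEY = "VALUE" in a .vmx file, replacing an
# existing assignment of the same key or appending a new line.
set_vmx_option() {
    vmx=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    # Copy every line except existing assignments of this key
    # (the key is matched literally, followed by optional blanks and "=").
    awk -v k="$key" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)
            if (index(line, k) == 1 && substr(line, length(k) + 1) ~ /^[ \t]*=/) next
            print
        }' "$vmx" > "$tmp" || { rm -f "$tmp"; return 1; }
    printf '%s = "%s"\n' "$key" "$value" >> "$tmp"
    # Overwrite in place so the original file keeps its owner and mode.
    cat "$tmp" > "$vmx" && rm -f "$tmp"
}

# Example use (placeholder path):
#   set_vmx_option /path/to/vm.vmx efi.quickBoot.enabled FALSE
```

Running it twice leaves a single `efi.quickBoot.enabled = "FALSE"` line, so it is safe to include in a provisioning step.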

If the patch above works in your environment, I can create a proper GRUB patch that adds a new command to perform the device connection, rather than reusing lsefi for this task.

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: Grub error after upgrading VMware Compatibility from 15 to 19

#10 Post by p.H »

petrvandrovec2006 wrote: 2023-01-05 11:11 VMware's UEFI implements a quick boot feature, which is enabled by default from hardware version 16 onward. With quick boot, only the devices that are necessary for loading the boot loader are initialized. So if the boot loader needs to access other devices (e.g. other hard disks), it must itself connect controllers to all known UEFI handles.
Isn't it simpler to disable quick boot ?
