GRUB won't boot from second hard drive in RAID 1 array

debianmnb
Posts: 4
Joined: 2022-01-28 16:34

GRUB won't boot from second hard drive in RAID 1 array

#1 Post by debianmnb »

I've spent a couple of hours on this (and gone through dozens of StackExchange posts) and am still stuck.

I've got a Debian box in a software RAID 1 configuration. The machine currently boots fine.

I'm in the process of replacing the hard drives in my machine. In the process, I noticed that my boot process was actually not mirrored the way I expected. When I tried removing my first hard drive (/dev/sda) the server doesn't boot. It seems I can't boot off of (/dev/sdb) alone.

I've tried a couple of things:

sudo grub-install /dev/sdb, sudo update-grub /dev/sdb
sudo dpkg-reconfigure grub-pc (note that when I've done this, it gives me the option to install on sda, sdb, and md0. I've checked both sda and sdb and the installation appears to succeed. The installation fails, though, when I also check md0.)
I'll note too that it seems like GRUB is pointing at my raid array (md0), rather than the drives themselves (i.e. sda or sdb). I imagine that could be the problem since the array won't have started at the time GRUB is loading during the boot? Still, I thought I had seen that GRUB2 (which is what I believe is installed) should be able to handle a RAID array?

I'd love to hear any ideas that folks might have, and THANK YOU so much in advance! If it helps, here's my system configuration:

Code: Select all

# taken from grub.cfg
if [ x$feature_default_font_path = xy ] ; then
   font=unicode
else
insmod part_msdos
insmod part_msdos
insmod diskfilter
insmod mdraid09
insmod ext2
set root='mduuid/73f4f8fa4b4d9ea4dbaa835b9c9612ac'
if [ x$feature_platform_search_hint = xy ]; then
  search --no-floppy --fs-uuid --set=root --hint='mduuid/73f4f8fa4b4d9ea4dbaa835b9c9612ac'  20fb3f71-3911-4f65-8773-7cf6bf334e0d
else
  search --no-floppy --fs-uuid --set=root 20fb3f71-3911-4f65-8773-7cf6bf334e0d
fi
    font="/usr/share/grub/unicode.pf2"
fi

Code: Select all

~$ sudo blkid
/dev/sda1: UUID="73f4f8fa-4b4d-9ea4-dbaa-835b9c9612ac" TYPE="linux_raid_member"
/dev/sda5: UUID="882c3c60-15c3-d6f8-d399-c9bc0b1041c7" TYPE="linux_raid_member"
/dev/sda6: UUID="6b77ee71-58d7-20c2-e510-ea20661b7451" TYPE="linux_raid_member"
/dev/sda7: UUID="361b574d-18d0-f4a5-3076-c1e8f6240de2" TYPE="linux_raid_member"
/dev/sdb1: UUID="73f4f8fa-4b4d-9ea4-dbaa-835b9c9612ac" TYPE="linux_raid_member"
/dev/sdb5: UUID="882c3c60-15c3-d6f8-d399-c9bc0b1041c7" TYPE="linux_raid_member"
/dev/sdb6: UUID="6b77ee71-58d7-20c2-e510-ea20661b7451" TYPE="linux_raid_member"
/dev/sdb7: UUID="361b574d-18d0-f4a5-3076-c1e8f6240de2" TYPE="linux_raid_member"
/dev/md0: UUID="20fb3f71-3911-4f65-8773-7cf6bf334e0d" TYPE="ext3"
/dev/md1: TYPE="swap"
/dev/md2: UUID="089c64c1-fefc-4cb4-8b9f-f06439f6a757" TYPE="ext3"
/dev/md3: UUID="0865f8c6-95ac-4582-b7c5-9e9d02a34e8e" TYPE="ext3"
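
Comparing the grub.cfg excerpt with this blkid output, the mduuid that GRUB uses for its root appears to be the linux_raid_member UUID of sda1/sdb1 with the dashes removed. That suggests GRUB looks the array up by its RAID UUID rather than by a device name such as sda or sdb. A minimal sketch of the conversion (`to_mduuid` is a made-up helper name, not a GRUB or mdadm command):

```shell
#!/bin/sh
# Sketch: derive the GRUB "mduuid" form from a blkid-style RAID member UUID.
# to_mduuid is an illustrative helper name, not a GRUB or mdadm command.
to_mduuid() {
    printf '%s\n' "$1" | tr -d '-'
}

# UUID reported by blkid for /dev/sda1 and /dev/sdb1 in the listing above
to_mduuid 73f4f8fa-4b4d-9ea4-dbaa-835b9c9612ac
```

The printed value can be compared by eye with the `set root='mduuid/...'` line in grub.cfg.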

Code: Select all

~$ lsblk
NAME    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
fd0       2:0    1     4K  0 disk  
sda       8:0    0 698.7G  0 disk  
├─sda1    8:1    0  18.6G  0 part  
│ └─md0   9:0    0  18.6G  0 raid1 /
├─sda2    8:2    0     1K  0 part  
├─sda5    8:5    0   1.9G  0 part  
│ └─md1   9:1    0   1.9G  0 raid1 [SWAP]
├─sda6    8:6    0 169.4G  0 part  
│ └─md2   9:2    0 169.4G  0 raid1 /home
└─sda7    8:7    0 465.7G  0 part  
  └─md3   9:3    0 465.7G  0 raid1 /time_mac
sdb       8:16   0 931.5G  0 disk  
├─sdb1    8:17   0  18.6G  0 part  
│ └─md0   9:0    0  18.6G  0 raid1 /
├─sdb2    8:18   0     1K  0 part  
├─sdb5    8:21   0   1.9G  0 part  
│ └─md1   9:1    0   1.9G  0 raid1 [SWAP]
├─sdb6    8:22   0 169.4G  0 part  
│ └─md2   9:2    0 169.4G  0 raid1 /home
└─sdb7    8:23   0 465.7G  0 part  
  └─md3   9:3    0 465.7G  0 raid1 /time_mac
sr0      11:0    1  1024M  0 rom 

Code: Select all

$ sudo parted -l
Model: ATA SAMSUNG HD753LJ (scsi)
Disk /dev/sda: 750GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type      File system     Flags
 1      32.3kB  20.0GB  20.0GB  primary   ext3            raid
 2      20.0GB  704GB   684GB   extended
 5      20.0GB  22.0GB  1999MB  logical   linux-swap(v1)  raid
 6      22.0GB  204GB   182GB   logical   ext3            raid
 7      204GB   704GB   500GB   logical   ext3            raid


Model: ATA WDC WD1005FBYZ-0 (scsi)
Disk /dev/sdb: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type      File system     Flags
 1      32.3kB  20.0GB  20.0GB  primary   ext3            raid
 2      20.0GB  704GB   684GB   extended
 5      20.0GB  22.0GB  1999MB  logical   linux-swap(v1)  raid
 6      22.0GB  204GB   182GB   logical   ext3            raid
 7      204GB   704GB   500GB   logical   ext3            raid

Code: Select all

$ sudo grub-probe --target=device /boot/grub
/dev/md0

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: GRUB won't boot from second hard drive in RAID 1 array

#2 Post by p.H »

debianmnb wrote: 2022-01-28 16:39 When I tried removing my first hard drive (/dev/sda) the server doesn't boot. It seems I can't boot off of (/dev/sdb) alone.
What happens exactly?
debianmnb wrote: 2022-01-28 16:39 grub-install /dev/sdb
This installs GRUB on /dev/sdb.
debianmnb wrote: 2022-01-28 16:39 update-grub /dev/sdb
Pointless. update-grub does not take any arguments; it only regenerates grub.cfg and does not care where GRUB is installed.
debianmnb wrote: 2022-01-28 16:39 dpkg-reconfigure grub-pc (note that when I've done this, it gives me the option to install on sda, sdb, and md0. I've checked both sda and sdb and the installation appears to succeed.
Doing so is a necessary step to have GRUB automatically updated on both drives after each grub-pc package upgrade. It must be done after each disk replacement.
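
For a one-off manual install on both members, something along these lines could be used. This is only a sketch, assuming BIOS boot with grub-pc as in this thread and that the two disks are currently /dev/sda and /dev/sdb. `install_grub_on` and `DRY_RUN` are hypothetical names; with `DRY_RUN=1` (the default here) the commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch: install GRUB to the MBR of each disk backing the RAID 1 array.
# DRY_RUN and install_grub_on are illustrative names, not part of GRUB.
DRY_RUN="${DRY_RUN:-1}"

install_grub_on() {
    for disk in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "grub-install $disk"
        else
            grub-install "$disk" || return 1
        fi
    done
    # update-grub takes no arguments; it only regenerates /boot/grub/grub.cfg
    if [ "$DRY_RUN" = 1 ]; then
        echo "update-grub"
    else
        update-grub
    fi
}

# Assumed device names; check them with lsblk before running for real.
install_grub_on /dev/sda /dev/sdb
```

Running it as root with `DRY_RUN=0` would execute the commands. It does not replace the dpkg-reconfigure step, which is what makes future grub-pc upgrades reinstall GRUB on both drives.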

Now GRUB should be able to boot from the remaining drive. If GRUB boots in rescue mode, you may also need to fail and remove the RAID members from their respective arrays before physically removing a drive.

Code: Select all

mdadm /dev/md0 --fail /dev/sda1 # WARNING: make sure you fail the right drive!
mdadm /dev/md0 --remove failed
A long time ago I observed that GRUB seems to expect to find the number of members specified in the RAID superblocks. So if a member is missing before it is removed from the other members' superblocks, GRUB won't assemble the array.
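
To see how many members each array expects and how many are present, the kernel's /proc/mdstat can be checked before pulling a drive. Its status line looks like `[2/1] [U_]`: two members expected, one active, with `_` marking the missing one. The following is a rough sketch that lists arrays with a missing member (`degraded_arrays` is a made-up name). It shows the running kernel's view only, which is not necessarily what GRUB sees at boot.

```shell
#!/bin/sh
# Sketch: list md arrays whose status shows a missing member.
# Reads /proc/mdstat-format text on stdin. degraded_arrays is an illustrative name.
degraded_arrays() {
    awk '
        /^md[^ ]* :/      { name = $1 }     # remember the array on its header line
        /\[[U_]*_[U_]*\]/ { print name }    # "_" inside [U_] marks a missing member
    '
}

# Only reads the kernel status file if it exists; prints nothing when all is well.
if [ -r /proc/mdstat ]; then
    degraded_arrays < /proc/mdstat
fi
```

`mdadm --detail /dev/md0` gives the same information in more detail for a single array.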

If GRUB does not even boot in rescue mode, then it is a BIOS issue.
Last edited by p.H on 2022-02-02 08:25, edited 2 times in total.

debianmnb
Posts: 4
Joined: 2022-01-28 16:34

Re: GRUB won't boot from second hard drive in RAID 1 array

#3 Post by debianmnb »

Thank you so much p.H.!

I'm getting closer thanks to your help, but not quite yet there.

I realize that I had "cheated." When I said earlier that I removed the first hard drive (/dev/sda), I had also attached the new hard drive BEFORE rebooting. Based on your suggestions, I tried booting WITHOUT connecting the new hard drive (just /dev/sdb connected) and voila, the machine booted!

Interestingly, here's the new structure of the RAID with the hard drive that was previously /dev/sdb being the sole HD connected. It appears the old "/dev/sdb" has turned into "/dev/sda"?

Code: Select all

lad@server-of-stinkpot:~$ sudo lsblk
[sudo] password for lad: 
NAME    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
fd0       2:0    1     4K  0 disk  
sda       8:0    0 931.5G  0 disk  
├─sda1    8:1    0  18.6G  0 part  
│ └─md0   9:0    0  18.6G  0 raid1 /
├─sda2    8:2    0     1K  0 part  
├─sda5    8:5    0   1.9G  0 part  
│ └─md1   9:1    0   1.9G  0 raid1 [SWAP]
├─sda6    8:6    0 169.4G  0 part  
│ └─md2   9:2    0 169.4G  0 raid1 /home
└─sda7    8:7    0 465.7G  0 part  
  └─md3   9:3    0 465.7G  0 raid1 /time_mac
sr0      11:0    1  1024M  0 rom   
I went on to try "removing" the hard drive I had disconnected as suggested:

sudo mdadm /dev/md0 --remove /dev/sda1

yields the error: "mdadm: hot remove failed for /dev/sda1: Device or resource busy"

For good measure, I tried removing "sdb" too:

sudo mdadm /dev/md0 --remove /dev/sdb1 yields a "Cannot find /dev/sdb1: No such file or directory" error.

I of course realize I could probably remove the hard drive /dev/sda using mdadm in my original configuration while both working hard drives are plugged in. However, I'm (perhaps too stubbornly) trying to simulate the scenario in which I suffer a catastrophic hard drive failure on one drive and cannot boot from it. Surely there's a way that I could physically remove /dev/sda, boot from the other drive (/dev/sdb), and subsequently rebuild the RAID with a new hard drive replacing /dev/sda?

Thus, does anyone have an idea for what I should try after I disconnect /dev/sda and boot from /dev/sdb?

Thank you again!

debianmnb
Posts: 4
Joined: 2022-01-28 16:34

Re: GRUB won't boot from second hard drive in RAID 1 array

#4 Post by debianmnb »

Sorry, I forgot to also mention that when I disconnect /dev/sda, leave /dev/sdb plugged in, and connect my new, blank hard drive using the cable that was previously attached to /dev/sdb, and attempt to boot, I don't reach any of the GRUB boot screens and instead see the error:

"DISK BOOT FAILURE, INSERT SYSTEM DISK AND PRESS ENTER"

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: GRUB won't boot from second hard drive in RAID 1 array

#5 Post by p.H »

debianmnb wrote: 2022-02-02 02:08 Interestingly, here's the new structure of the RAID with the hard drive that was previously /dev/sdb being the sole HD connected. It appears the old "/dev/sdb" has turned into "/dev/sda"?
Of course. /dev/sd* drives are named in (possibly random) discovery order, so the first and only drive is always /dev/sda. No big deal. Just do not rely on device names, they are not stable.
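
To tell which physical drive is currently behind which /dev/sd* name, the persistent symlinks under /dev/disk/by-id (which normally include the model and serial number) can be resolved. A small sketch, assuming a udev-based system such as Debian and `readlink -f` from coreutils; `list_stable_names` is a made-up name:

```shell
#!/bin/sh
# Sketch: print each persistent disk name and the kernel device it points to.
# list_stable_names is an illustrative name; the directory argument is optional.
list_stable_names() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue          # skip anything that is not a symlink
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}

list_stable_names
```

This is worth running right before any mdadm --fail or grub-install command, since the sda/sdb assignment may differ from one boot to the next.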
debianmnb wrote: 2022-02-02 02:08 I went on to try "removing" the hard drive I had disconnected as suggested:
Sorry, my suggestion was wrong. An active member cannot be removed. It must be failed first. So the right command was

Code: Select all

mdadm /dev/md0 --fail /dev/sda1
mdadm /dev/md0 --remove failed
DO NOT DO THIS with only one active member, or you will fail the whole array!
You can remove detached and failed members with

Code: Select all

mdadm /dev/md0 --remove failed detached
debianmnb wrote: 2022-02-02 02:08 I'm (perhaps too stubbornly) trying to simulate the scenario in which I suffer a catastrophic hard drive failure on one drive and cannot boot from it.
A hard drive failure should not be catastrophic with RAID 1 and the system should keep working until shut down. The failing member should automatically be flagged "failed" and you can remove it with mdadm.
debianmnb wrote: 2022-02-02 02:33 when I disconnect /dev/sda, leave /dev/sdb plugged in, and connect my new, blank hard drive using the cable that was previously attached to /dev/sdb, and attempt to boot, I don't reach any of the GRUB boot screens and instead see the error:

"DISK BOOT FAILURE, INSERT SYSTEM DISK AND PRESS ENTER"
This is a BIOS error when trying to boot the new blank drive. Just tell the BIOS to boot the other drive first, or swap the drive connections.
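
Once the machine boots from the surviving drive with the blank drive attached, the rebuild asked about earlier can be sketched roughly as below. The script only prints a plan and runs nothing. It assumes the surviving disk shows up as /dev/sda and the blank one as /dev/sdb (check with lsblk first, since names are not stable), that both use an msdos partition table as in the parted output, that the new disk is at least as large as the partitions being copied, and that the partition-to-array mapping is the one from the lsblk listing (md0 on partition 1, md1 on 5, md2 on 6, md3 on 7). `plan_rebuild`, `GOOD` and `NEW` are hypothetical names.

```shell
#!/bin/sh
# Sketch: print the commands to bring a blank replacement disk into the arrays.
# Nothing is executed; review the plan and run the commands by hand as root.
GOOD="${GOOD:-/dev/sda}"   # surviving member disk (assumption)
NEW="${NEW:-/dev/sdb}"     # blank replacement disk (assumption)

plan_rebuild() {
    good=$1
    new=$2
    # 1. Copy the partition table from the good disk to the new one.
    echo "sfdisk -d $good | sfdisk $new"
    # 2. Add each new partition to its array (mapping taken from lsblk above).
    for pair in md0:1 md1:5 md2:6 md3:7; do
        echo "mdadm /dev/${pair%%:*} --add ${new}${pair##*:}"
    done
    # 3. Make the new disk bootable as well.
    echo "grub-install $new"
}

plan_rebuild "$GOOD" "$NEW"
```

Resync progress can be followed in /proc/mdstat. Mixing up the two disks in the sfdisk step would overwrite the partition table of the good drive, so the device names need to be double-checked before running anything.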

debianmnb
Posts: 4
Joined: 2022-01-28 16:34

Re: GRUB won't boot from second hard drive in RAID 1 array

#6 Post by debianmnb »

Thank you so much again p.H. I wanted to circle back and say that my issue was finally solved. I feel quite dumb admitting this, but your last suggestion (swapping the drive connections) was all I needed to do to make things work!

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: GRUB won't boot from second hard drive in RAID 1 array

#7 Post by p.H »

Thanks for the feedback, always appreciated.
