Changing hostname might break RAID/mdadm

bithead

#1 Post by bithead

The first time this happened to me, I thought it must be a fluke. I scoured the net for solutions, and while I found lots of ideas that got me close to a fix, nothing let me actually recover. I found only one other report of someone who changed the hostname of a Linux machine and got the same result: the system could no longer find and mount its RAID drives. That person was also unable to fix the problem (http://www.cac.cornell.edu/slantz/perse ... pened.html), and in the end we both resorted to a clean install.

Then it happened to me again, this time on a VM so I was able to play with it and revert to snapshots to reproduce it, and eventually find a fix for the problem. And amazingly, after the days I spent working on this the first time, the solution is incredibly simple - I'm astounded that I was unable to find anything like it posted anywhere. Here's the scoop...

Install a new system (Debian 8 in my case) and configure software-based RAID. In my case, I had 4 drives with 2 arrays: RAID 1 across 4 drives holding just the /boot partition, and RAID 5 across 3 drives with 1 spare, holding /. Let's say I gave the machine the hostname HOSTA during the installation. At that point, initramfs is built with that name in its copy of the /etc/mdadm/mdadm.conf file, and the name is also recorded somewhere else in the RAID configuration (maybe in the superblock for each array?).

OK, the system is running fine, and along the way I decide that HOSTB is a more appropriate name for the machine. No problem: update /etc/hosts, /etc/hostname, and maybe a few other files (even /etc/mdadm/mdadm.conf!). Reboot and all is still OK... for a while. Then one day it's time to update the OS software: apt-get update, followed by apt-get upgrade. Everything is still OK until the next reboot, when the initramfs loads, tries to mount the root file system, and greets me with:

Code:

Gave up waiting for root device. Common problems:
  - Boot args (cat /proc/cmdline)
    - Check rootdelay= (did the system wait long enough?)
    - Check root= (did the system wait for the right device?)
  - Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/disk/by-uuid/054a6dcc-d280-4aff-9977 does not exist.
Dropping to a shell!

BusyBox v.1.22.1 (Ubuntu 1:1.22.0-9deb8u1) built-in shell (ash)   
Enter 'help' for list of built-in commands.

/bin/sh: can't access tty: job control turned off
(initramfs)

As the other person who reported this problem titled his post... Software RAID: What Happened? Well, the software update included a new linux-image package among other things, and consequently it rebuilt the initramfs. The rebuilt initramfs picked up the machine's new hostname, HOSTB, which can be seen from the BusyBox prompt above in its copy of /etc/mdadm/mdadm.conf. And as shown below (**), the old hostname, HOSTA, is still an integral part of the RAID configuration. It is this mismatch that causes the problem. I literally spent days looking for a fix, which I never found, but rather stumbled on through bits and pieces that eventually fell together. And once discovered, it takes less than 10 minutes to implement! Here it is, from the above BusyBox screen going forward:

Code:

(initramfs) vi /etc/mdadm/mdadm.conf

<while in vi> :%s/HOSTB/HOSTA/g    # replaces all occurrences of HOSTB with HOSTA
<while in vi> :wq                  # saves the file and quits vi

(initramfs) mdadm --assemble --scan
mdadm: /dev/md/0 has been started with 4 drives.
mdadm: /dev/md/1 has been started with 3 drives and 1 spare.
(initramfs) exit                   # the system resumes the boot process and completes normally

After the system boots, a subsequent reboot would land me right back at the BusyBox screen. To avoid this, I need to make the same changes to the running system's copy of /etc/mdadm/mdadm.conf and then update initramfs. So, logged in with root access:

Code:

root@HOSTB:~# vi /etc/mdadm/mdadm.conf

<while in vi> :%s/HOSTB/HOSTA/g    # replaces all occurrences of HOSTB with HOSTA
<while in vi> :wq                  # saves the file and quits vi

root@HOSTB:~# update-initramfs -u -k all

In the update-initramfs command, '-u' says to update the existing initramfs, and '-k all' says to do so for all installed kernel images.
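
For a remote or scripted fix, the same substitution can be done without opening vi. What follows is only a sketch under stated assumptions, not part of the procedure above: set_array_host is a made-up helper name, it assumes the ARRAY lines in mdadm.conf carry the array name as name=HOST:N, and it rewrites only those name= fields rather than every occurrence of the hostname.

```shell
#!/bin/sh
# Sketch only: set_array_host is a hypothetical helper, not part of mdadm.
# Assumption: ARRAY lines carry the array name as name=HOST:N.

# Usage: set_array_host CONF_FILE WRONG_HOST SUPERBLOCK_HOST
set_array_host() {
    conf=$1
    old=$2
    new=$3
    # Keep a backup copy so the edit can be undone.
    cp -p "$conf" "$conf.bak" || return 1
    # Rewrite only the host part of each name= field. Hostnames are assumed
    # to be plain letters, digits and hyphens, so they are safe inside sed.
    sed "s/name=${old}:/name=${new}:/g" "$conf.bak" > "$conf"
}
```

With the names used in this post, the call would be set_array_host /etc/mdadm/mdadm.conf HOSTB HOSTA, followed by the same update-initramfs -u -k all as above. The previous file is left next to it as mdadm.conf.bak.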

Now I can safely reboot the system with its new hostname and all updates installed.

(**) One way to see that the old hostname is still ingrained in the configuration is with the 'mdadm -D /dev/mdX' command, replacing X with the appropriate array number on your system. Here's a sample:

Code:

root@HOSTB:~# mdadm -D /dev/md1
/dev/md1:
        Version : 1.2
  Creation Time : Thu Dec 29 09:02:43 2016
     Raid Level : raid5
     Array Size : 2130618368 (2031.92 GiB 2181.75 GB)
  Used Dev Size : 1065309184 (1015.96 GiB 1090.88 GB)
   Raid Devices : 3
  Total Devices : 4
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Thu Feb  9 17:25:16 2017
          State : clean 
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

           Name : HOSTA:1       # <--- here is the original hostname given to the machine during installation
           UUID : ae1ab583:e5e30fb9:e2c807a2:b5e95f42
         Events : 4831

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3

       3       8       51        -      spare   /dev/sdd3

And now... what about the next time installing updates causes initramfs to be rebuilt? Rebooting after installing the updates will again land at the BusyBox screen. However (lesson learned?), if I edit /etc/mdadm/mdadm.conf and run update-initramfs after installing updates and before rebooting, the system should come up normally on the first reboot, and continue to do so until initramfs is rebuilt again. Beware, however: if you install updates remotely, making these changes before rebooting may save your job!
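
That pre-reboot check can be partly automated. The sketch below is an illustration under assumptions rather than a tested tool: detail_host and conf_has_host are made-up helper names, the first expects 'mdadm -D' output in the format of the sample above, and the second assumes mdadm.conf ARRAY lines use the name=HOST:N form.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of mdadm.

# Print the host part of the "Name : HOST:N" line from 'mdadm -D' output
# read on stdin (prints nothing if there is no Name line).
detail_host() {
    awk '$1 == "Name" && $2 == ":" { split($3, part, ":"); print part[1]; exit }'
}

# Succeed if CONF_FILE has an ARRAY line whose name= field starts with HOST.
# Usage: conf_has_host CONF_FILE HOST
conf_has_host() {
    grep -q "^ARRAY.*name=$2:" "$1"
}

# Example (as root), after installing updates and before rebooting:
#   host=$(mdadm -D /dev/md1 | detail_host)
#   conf_has_host /etc/mdadm/mdadm.conf "$host" ||
#       echo "mdadm.conf does not name $host: fix it, then update-initramfs -u -k all"
```

Note that this only inspects the running system's /etc/mdadm/mdadm.conf, not the copy already packed into initramfs, so update-initramfs still has to be run after any correction.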

And finally... why are we having to deal with this at all? Is it a bug in apt-get, updating parts of initramfs that it should not? Or is the problem with mdadm, embedding a volatile hostname parameter as a permanent and irrevocable part of its configuration? Both of these questions are beyond my depth, but I post them here hoping someone whose depth is more appropriate can and will pass them along to relevant parties for consideration.

Update: I've since learned that the file update is done by a post-install script, not by apt. So it is the post-install script that needs to avoid replacing the hostname value in mdadm.conf.
