RAID superblocks being wiped on reboot

PlunderingPirate9000
Posts: 35
Joined: 2017-11-30 15:17

RAID superblocks being wiped on reboot

#1 Post by PlunderingPirate9000 »

I've been experiencing a really weird issue with mdadm RAID 1 arrays.

I create an array like `sudo mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb`

The array will finish syncing, and all partitions will be there from the previous raid configuration.

I can mount the partitions, read from and write to them, etc.

At some point in the past I would have done `sudo mkfs.ext4 -F /dev/md0`

However, after re-creating the array with the same devices, I do not repeat this step, since the partitions/data are already there on the now-synced drives.

I then do `sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf`.

`sudo update-initramfs -u`

`echo '/dev/md0 /mnt/md0 ext4 defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab`

and also remove the previous entries from /etc/fstab and mdadm.conf.

Then, when I reboot - the array is gone.

`cat /proc/mdstat` is blank.

I can repeat the raid array creation as described above and everything comes back.

I have absolutely no clue why this is happening. The only difference between this system and other systems I have used in the past is that the drives in this system are connected via a PCIe SAS card.

LE_746F6D617A7A69
Posts: 932
Joined: 2020-05-03 14:16
Has thanked: 7 times
Been thanked: 65 times

Re: RAID superblocks being wiped on reboot

#2 Post by LE_746F6D617A7A69 »

First, You should read the manual for mdadm, and not just "quick setup" instructions from some outdated article on the web. The manual is excellent and it contains a lot of valuable information ;)

Your primary problem is this command:
PlunderingPirate9000 wrote:`echo '/dev/md0 /mnt/md0 ext4 defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab`
The md0 device exists only until reboot - then it becomes md127 (by default, if You have only one array in the system and You don't specify the array name in mdadm.conf)

Fundamental rule: *never* use device paths in fstab - use UUIDs

The UUID in the /etc/fstab is the filesystem UUID, not the md-array UUID.
The UUID in /etc/mdadm/mdadm.conf is the md-array UUID and not the filesystem UUID.

ext4 UUID can be checked with tune2fs -l
md-array UUID is listed by mdadm -D
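For example (just a sketch - the device and mount point are placeholders, and the UUID must be Your own value):

Code: Select all

# filesystem UUID - this is what goes into /etc/fstab:
sudo blkid /dev/md0
sudo tune2fs -l /dev/md0 | grep UUID

# md-array UUID - this is what goes into /etc/mdadm/mdadm.conf:
sudo mdadm -D /dev/md0 | grep UUID

# fstab line referencing the filesystem UUID:
UUID=<filesystem-uuid-from-blkid>  /mnt/md0  ext4  defaults,nofail  0  2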

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: RAID superblocks being wiped on reboot

#3 Post by p.H »

How do you know that RAID superblocks are wiped ? You did not show any direct evidence such as the output of

Code: Select all

blkid /dev/sd*
file -s /dev/sd*
wipefs /dev/sda
wipefs /dev/sdb
mdadm --detail --scan --verbose
mdadm --examine /dev/sd*
and so on.
PlunderingPirate9000 wrote:all partitions will be there from the previous raid configuration.
What partitions ? Is this a partitioned array ?
What previous RAID configuration ?

Note : using whole drives instead of partitions as RAID members is a bad idea IMO. It may conflict with a partition table.

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: RAID superblocks being wiped on reboot

#4 Post by p.H »

LE_746F6D617A7A69 wrote:The md0 device exists only until reboot - then it becomes md127 (by default, if You have only one array in the system and You don't specify the array name in mdadm.conf)
In my experience, an md device is assigned a high number like md127 only if it is not defined in mdadm.conf, neither in /etc nor in the initramfs, and if its superblock records a hostname different from the current system's.
LE_746F6D617A7A69 wrote:*never* use device paths in fstab
Except when the device path is predictable and persistent, such as
/dev/mapper/*
/dev/disk/by-id/*
/dev/disk/by-path/*

Anyway, a wrong device path in /etc/fstab is not the cause of the md device not being assembled and present in /proc/mdstat.
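For reference, an mdadm.conf entry that defines the array (and thus pins the device name) might look like this (a sketch - the UUID is a placeholder for the one printed by mdadm --detail --scan):

Code: Select all

ARRAY /dev/md0 metadata=1.2 UUID=<array-uuid-from-mdadm-detail-scan>

followed by update-initramfs -u, so that the copy of mdadm.conf inside the initramfs matches the one in /etc.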

LE_746F6D617A7A69
Posts: 932
Joined: 2020-05-03 14:16
Has thanked: 7 times
Been thanked: 65 times

Re: RAID superblocks being wiped on reboot

#5 Post by LE_746F6D617A7A69 »

p.H wrote:Except when the device path is predictable and persistent, such as
p.H wrote:/dev/disk/by-id/*
Yes, in most cases this will work, but it's the UUID that is the standard designed to identify devices/logical volumes.
p.H wrote:/dev/disk/by-path/*
This will *not* work if You add/remove expansion card(s) - the PCI bus numbers can change.
p.H wrote:Anyway, a wrong device path in /etc/fstab is not the cause of the md device not being assembled and present in /proc/mdstat.
Yes it can be, e.g. if conflicting array declarations are found.
p.H wrote:Note : using whole drives instead of partitions as RAID members is a bad idea IMO. It may conflict with a partition table.
We already had a short "discussion" about this topic not so long ago - in the case of HW and BIOS-based software RAID arrays, the partition tables are created on top of partition-less RAID devices, so there's no problem at all with creating partition-less md arrays.
One of the reasons for using partition-based md RAID arrays is to have EFI-bootable arrays (i.e. for booting the system); another reason could be, e.g., to have smaller volumes, which are easier to manage.

PlunderingPirate9000
Posts: 35
Joined: 2017-11-30 15:17

Re: RAID superblocks being wiped on reboot

#6 Post by PlunderingPirate9000 »

p.H wrote:How do you know that RAID superblocks are wiped ? You did not show any direct evidence such as the output of

Code: Select all

blkid /dev/sd*
file -s /dev/sd*
wipefs /dev/sda
wipefs /dev/sdb
mdadm --detail --scan --verbose
mdadm --examine /dev/sd*
and so on.

I don't know what wipefs is. Does it install with `sudo apt install wipefs`?

Here is the output of blkid:

Code: Select all

/dev/sdb: PTUUID="a2a2e184-6130-4691-9fb7-fdcdeb2c19e9" PTTYPE="gpt"
/dev/sdb1: UUID="957ef167-565e-4afc-ac01-23e8e6094294" TYPE="ext4" PARTUUID="1af5ef8f-9f67-48db-b625-9a1a516467a0"
/dev/sdc: PTUUID="4b21720b-0277-4315-aa00-288cda4ec8f8" PTTYPE="gpt"
/dev/sdc1: UUID="917a60a8-82fa-49b5-ac0b-82a3b021271d" TYPE="ext4" PARTUUID="45e77d7c-fecc-41fa-ab96-0d49268c5d71"
/dev/sdd: PTUUID="576ecf35-a453-4046-b942-b371eed85c6c" PTTYPE="gpt"
/dev/sdd1: UUID="c1ecc5ce-9ac2-4aa6-b373-66db2d586f9d" TYPE="ext4" PARTUUID="9244a088-dae2-409c-b7a0-80c89d0d03ff"
/dev/sde: PTUUID="2df74fa8-173a-44a1-bde1-96f45060bce8" PTTYPE="gpt"
/dev/sde1: UUID="73f626e3-3fff-4e0e-940a-cc3bbbb2df04" TYPE="ext4" PARTUUID="a8513384-d9dd-4577-9759-0a97ce5a8668"
/dev/sdf: PTUUID="0735d711-b073-403b-a93f-cb03b418e47c" PTTYPE="gpt"
/dev/sdf1: UUID="97c2375a-57c6-47b0-a454-80d27b0b66b1" TYPE="ext4" PARTUUID="aae2c35f-181c-4dca-9ad6-fc551c4229b7"
/dev/sdg: PTUUID="e65148e2-1cb1-4768-8fbb-a78e07c61acb" PTTYPE="gpt"
All of these should be members of various RAID 1 arrays. sda is missing because I blanked the drive.
p.H wrote:
PlunderingPirate9000 wrote:all partitions will be there from the previous raid configuration.
What partitions ? Is this a partitioned array ?
What previous RAID configuration ?

Note : using whole drives instead of partitions as RAID members is a bad idea IMO. It may conflict with a partition table.

I guess the best thing to do is to provide you with a link to the article I was following.

https://www.digitalocean.com/community/ ... id-1-array

I mean perhaps the info presented in the above link is wrong. I have no idea.

PlunderingPirate9000
Posts: 35
Joined: 2017-11-30 15:17

Re: RAID superblocks being wiped on reboot

#7 Post by PlunderingPirate9000 »

Forgot to add:

Code: Select all

sudo mdadm --detail --verbose --scan
sudo mdadm --examine
etc
are all blank, or spit out "no superblock"

PlunderingPirate9000
Posts: 35
Joined: 2017-11-30 15:17

Re: RAID superblocks being wiped on reboot

#8 Post by PlunderingPirate9000 »

p.H wrote:
LE_746F6D617A7A69 wrote:The md0 device exists only until reboot - then it becomes md127 (by default, if You have only one array in the system and You don't specify the array name in mdadm.conf)
In my experience, an md device is assigned a high number like md127 only if it is not defined in mdadm.conf, neither in /etc nor in the initramfs, and if its superblock records a hostname different from the current system's.
It does not become md127. It does not appear as md anything. It does not exist after reboot. I agree that is what should happen, but it does not. I have no idea why. Sorry I cannot be more useful than this - I really am no expert on this subject.
p.H wrote:
LE_746F6D617A7A69 wrote:*never* use device paths in fstab
Except when the device path is predictable and persistent, such as
/dev/mapper/*
/dev/disk/by-id/*
/dev/disk/by-path/*

Anyway, a wrong device path in /etc/fstab is not the cause of the md device not being assembled and present in /proc/mdstat.
I also thought this - you might be correct. Actually, since I am using a SAS PCIe card, it appears that /dev/sda is always /dev/sda, etc. I thought the device names could be reordered on reboot, but apparently this is not the case.

PlunderingPirate9000
Posts: 35
Joined: 2017-11-30 15:17

Re: RAID superblocks being wiped on reboot

#9 Post by PlunderingPirate9000 »

LE_746F6D617A7A69 wrote:
p.H wrote:Note : using whole drives instead of partitions as RAID members is a bad idea IMO. It may conflict with a partition table.
We already had a short "discussion" about this topic not so long ago - in the case of HW and BIOS-based software RAID arrays, the partition tables are created on top of partition-less RAID devices, so there's no problem at all with creating partition-less md arrays.
One of the reasons for using partition-based md RAID arrays is to have EFI-bootable arrays (i.e. for booting the system); another reason could be, e.g., to have smaller volumes, which are easier to manage.
This went way over my head.

PlunderingPirate9000
Posts: 35
Joined: 2017-11-30 15:17

Re: RAID superblocks being wiped on reboot

#10 Post by PlunderingPirate9000 »

In summary, still after reading these replies, I have no idea what might be wrong or what I should be doing.

Where is the source of information for how to create a RAID 1 array with mdadm - because I have done this before experiencing no issues. I don't know what is different this time.

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: RAID superblocks being wiped on reboot

#11 Post by p.H »

wipefs is a standard tool which is installed by default (it comes with util-linux). Without options, it prints the filesystem/RAID/LVM... signatures found on the target device.

blkid shows that the drives have a GPT partition table and a partition. If you use the whole drives as RAID members, you should wipe the partition tables first, because the GPT table and the RAID superblock overlap each other. If you keep the partition tables, you should use the partitions as the RAID members.
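For example, just to list (not erase) the signatures on one of those drives - substitute your actual device names:

Code: Select all

wipefs /dev/sdb
wipefs /dev/sdb1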

No time to read the article now, will do it later.

PlunderingPirate9000
Posts: 35
Joined: 2017-11-30 15:17

Re: RAID superblocks being wiped on reboot

#12 Post by PlunderingPirate9000 »

p.H wrote:wipefs is a standard tool which is installed by default (it comes with util-linux). Without options, it prints the filesystem/RAID/LVM... signatures found on the target device.

blkid shows that the drives have a GPT partition table and a partition. If you use the whole drives as RAID members, you should wipe the partition tables first, because the GPT table and the RAID superblock overlap each other. If you keep the partition tables, you should use the partitions as the RAID members.

No time to read the article now, will do it later.
Ok thanks for this - it sounds like a step in the right direction.

What should I do to wipe the partition information from the drives first, to ensure there is no overlap?

I'm very confused, you see, because a lot of online information doesn't mention anything about this... For example, please see also https://raid.wiki.kernel.org/index.php/RAID_setup#

Then of course the latter part of the question is: once I have two blanked drives and make them into a RAID device, how do I then create a partition without nuking the RAID information?

I'm pretty convinced this must be what is happening - thanks for the info

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: RAID superblocks being wiped on reboot

#13 Post by p.H »

PlunderingPirate9000 wrote:What should I do to wipe the partition information from the drives first, to ensure there is no overlap?
You can use wipefs with the -a option to erase all metadata signatures on a device; see "man wipefs" for details.
Or write zeroes on the first MB of the drive with dd.
PlunderingPirate9000 wrote:once I have two blanked drives and make them into a RAID device, how do I then create a partition without nuking the RAID information?
Why do you need to create a partition ? You can use the RAID array /dev/mdN directly for the filesystem. Just format it with mkfs.<fstype> and mount it.
If you need to divide the RAID array into several areas, you can use LVM on top of the RAID array. This was the traditional method when RAID arrays were not partitionable. Or you can create a partition table and partitions on the RAID array /dev/mdN (NOT on the RAID member drives /dev/sdX).
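For example, the LVM route could look roughly like this (a sketch - the volume group name, logical volume name and size are just placeholders):

Code: Select all

pvcreate /dev/mdN
vgcreate vg_raid /dev/mdN
lvcreate -L 100G -n lv_data vg_raid
mkfs.ext4 /dev/vg_raid/lv_data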

PlunderingPirate9000
Posts: 35
Joined: 2017-11-30 15:17

Re: RAID superblocks being wiped on reboot

#14 Post by PlunderingPirate9000 »

p.H wrote:
PlunderingPirate9000 wrote:What should I do to wipe the partition information from the drives first, to ensure there is no overlap?
You can use wipefs with the -a option. "man wipefs" for details to erase all metadata signatures on a device.
Or write zeroes on the first MB of the drive with dd.
Yeah, again - not trying to be obtuse here, but how do I install this? It's not installed by default. [Edit: it is installed by default - I was forgetting to use sudo - duh]
p.H wrote:
PlunderingPirate9000 wrote:once I have two blanked drives and make them into a RAID device, how do I then create a partition without nuking the RAID information?
Why do you need to create a partition ? You can use the RAID array /dev/mdN directly as a filesystem. Just format it with mkfs.<fstype> and mount it.
If you need to divide the RAID array into several areas you can use LVM over the RAID array. This was the traditional method when RAID arrays were not partitionable. Or you can create a partition table and partition on the RAID array /dev/mdN (NOT the RAID member drives /dev/sdX).

OK, I'm confused, because this is what I did before, and then after restarting the RAID/partition/filesystem was gone.

By the way I think what you call a partition is not what I mean by a partition, given the context of your reply.

In summary, you are giving me the same instructions as in that DigitalOcean link - and that did not work, though I am not sure why.

So yeah - if you give me a list of commands to run, I can run them and report back on my findings, but afaik you're giving me the same instructions I already followed.

p.H
Global Moderator
Global Moderator
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: RAID superblocks being wiped on reboot

#15 Post by p.H »

If you want to use whole drives /dev/sdX and /dev/sdY (not partitions) as RAID members for RAID 1 array /dev/mdN :

Code: Select all

# unmount partitions on the RAID drives (if they are mounted)
umount /dev/sdX1
umount /dev/sdY1

# erase signatures on the RAID drives and their partitions
wipefs -a /dev/sdX1
wipefs -a /dev/sdX
wipefs -a /dev/sdY1
wipefs -a /dev/sdY

# create RAID array
mdadm --create /dev/mdN --level=1 --raid-devices=2 /dev/sdX /dev/sdY

# erase signatures on the RAID array
wipefs -a /dev/mdN

# format RAID array as ext4
mkfs.ext4 /dev/mdN
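To make the array and the mount persistent across reboots, something along these lines should follow (a sketch - the mount point is only an example, and the UUID placeholder must be replaced with the value blkid prints for the ext4 filesystem on /dev/mdN):

Code: Select all

# record the array so it is assembled at boot
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u

# mount the filesystem by UUID, not by device path
blkid /dev/mdN
echo 'UUID=<ext4-uuid-from-blkid> /mnt/raid ext4 defaults,nofail 0 2' >> /etc/fstab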

LE_746F6D617A7A69
Posts: 932
Joined: 2020-05-03 14:16
Has thanked: 7 times
Been thanked: 65 times

Re: RAID superblocks being wiped on reboot

#16 Post by LE_746F6D617A7A69 »

p.H wrote:If you want to use whole drives /dev/sdX and /dev/sdY (not partitions) as RAID members for RAID 1 array /dev/mdN :

Code: Select all

# unmount partitions on the RAID drives (if they are mounted)
umount /dev/sdX1
umount /dev/sdY1

# erase signatures on the RAID drives and their partitions
wipefs -a /dev/sdX1
wipefs -a /dev/sdX
wipefs -a /dev/sdY1
wipefs -a /dev/sdY

# create RAID array
mdadm --create /dev/mdN --level=1 --raid-devices=2 /dev/sdX /dev/sdY

# erase signatures on the RAID array
wipefs -a /dev/mdN

# format RAID array as ext4
mkfs.ext4 /dev/mdN
^ While the above commands are technically correct, there are a few things to consider:

1) wipefs theoretically also wipes RAID superblocks, but it's better to use mdadm --zero-superblock --force /dev/sd[ab] - a more reliable method.

2) there's no need to wipe the filesystem superblocks, because:
__a) You may want to re-create the array in-place without destroying the data,
__b) mkfs.whatever will erase the superblocks anyway

3) for RAID 1 arrays with size < 100GB, use --bitmap=internal in Create mode (much faster sync operations)

4) use --assume-clean when creating the array - this disables the initial sync process and saves a *lot* of time. The initial sync makes sense only if the array is re-created in place (so it keeps old data), and is completely useless otherwise.
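Putting 1), 3) and 4) together, the create step could look like this (just a sketch - device names are placeholders, and --assume-clean only for a brand-new, empty array, as explained above):

Code: Select all

mdadm --zero-superblock --force /dev/sdX /dev/sdY
mdadm --create /dev/mdN --level=1 --raid-devices=2 --bitmap=internal --assume-clean /dev/sdX /dev/sdY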

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: RAID superblocks being wiped on reboot

#17 Post by p.H »

LE_746F6D617A7A69 wrote:wipefs theoretically also wipes RAID superblocks, but it's better to use mdadm --zero-superblock
mdadm --zero-superblock erases only RAID superblocks. I wanted to make sure to erase every other superblock too.
LE_746F6D617A7A69 wrote:You may want to re-create the array in-place without destroying the data,
Oh no, you don't want that. In such a situation you want to wipe all that mess and make sure you start from a clean state.
LE_746F6D617A7A69 wrote:mkfs.whatever will erase the superblocks anyway
I would not rely on any mkfs to erase all previous metadata signatures. Some can be located quite far away from the beginning of the device and left untouched.

LE_746F6D617A7A69
Posts: 932
Joined: 2020-05-03 14:16
Has thanked: 7 times
Been thanked: 65 times

Re: RAID superblocks being wiped on reboot

#18 Post by LE_746F6D617A7A69 »

p.H wrote:
LE_746F6D617A7A69 wrote:You may want to re-create the array in-place without destroying the data,
Oh no, you don't want that. In such a situation you want to wipe all that mess and make sure you start from a clean state.
Yes, I don't need that - it's the OP who asked about such a possibility ;)
p.H wrote:
LE_746F6D617A7A69 wrote:mkfs.whatever will erase the superblocks anyway
I would not rely on any mkfs to erase all previous metadata signatures. Some can be located quite far away from the beginning of the device and left untouched.
??? Most filesystems use more than just one superblock; even such crap as NTFS has 2 copies of it.

PlunderingPirate9000
Posts: 35
Joined: 2017-11-30 15:17

Re: RAID superblocks being wiped on reboot

#19 Post by PlunderingPirate9000 »

p.H wrote:If you want to use whole drives /dev/sdX and /dev/sdY (not partitions) as RAID members for RAID 1 array /dev/mdN :

Code: Select all

# unmount partitions on the RAID drives (if they are mounted)
umount /dev/sdX1
umount /dev/sdY1

# erase signatures on the RAID drives and their partitions
wipefs -a /dev/sdX1
wipefs -a /dev/sdX
wipefs -a /dev/sdY1
wipefs -a /dev/sdY

# create RAID array
mdadm --create /dev/mdN --level=1 --raid-devices=2 /dev/sdX /dev/sdY

# erase signatures on the RAID array
wipefs -a /dev/mdN

# format RAID array as ext4
mkfs.ext4 /dev/mdN
This seems to work, I've rebooted and the array has come up as md127.

The only difference between this and what I was doing before is the lack of the `-F` switch to mkfs.ext4 and the additional steps with wipefs. I guess skipping wipefs was causing some kind of problem?

I have then run

Code: Select all

sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
It adds this line to the file

Code: Select all

ARRAY /dev/md/sagittarius:10 metadata=1.2 name=sagittarius:10 UUID=f68adb8b:8eb45692:0224171b:0ee245a8
After rebooting, the array device is still md127 - do I need to run update-initramfs -u to fix this? It should be "md10".
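My guess is that I need to edit that ARRAY line so it names the device explicitly, and then refresh the initramfs - something like this (just a sketch of what I plan to try):

Code: Select all

# in /etc/mdadm/mdadm.conf:
ARRAY /dev/md10 metadata=1.2 name=sagittarius:10 UUID=f68adb8b:8eb45692:0224171b:0ee245a8

sudo update-initramfs -u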

PlunderingPirate9000
Posts: 35
Joined: 2017-11-30 15:17

Re: RAID superblocks being wiped on reboot

#20 Post by PlunderingPirate9000 »

Actually, I forgot to mention - I also did a run of dd if=/dev/zero of=... to zero the entire disks sda/sdb before recreating this array. This may have helped?
