Non-boot RAID5 array falls apart after reboot

Linux Kernel, Network, and Services configuration.
icefield
Posts: 5
Joined: 2005-10-23 21:15

Non-boot RAID5 array falls apart after reboot

#1 Post by icefield »

I've been trying for a few days to get a RAID5 array to "survive" a reboot of the system.

I'm running Debian Sarge with the Linux 2.6.8-2-386 kernel.

The system has one PATA disk containing all the system and boot partitions, with four SATA disks earmarked for the RAID array. Each 200GB SATA disk has (for the purposes of testing) a single 2GB partition of type fd.

Here's the procedure used to create the array:

(1) mknod /dev/md0 b 9 0
(2) mdadm -C /dev/md0 -n4 -l5 /dev/sd[abcd]1
(3) cat /proc/mdstat shows creation progress and finally shows:
Personalities : [raid5]
md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0]
5879424 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>
(4) mkfs.ext3 /dev/md0
-completes OK.
(5) mount /dev/md0 /mnt/x
- seems to work OK; lost+found created.
- I can write data to the disk, open files, etc. Everything seems OK.
(6) placed in /etc/fstab:
/dev/md0 /mnt/x ext3 defaults 0 2
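
(For reference, the on-disk RAID superblocks can be inspected before rebooting -- just a quick sanity check using mdadm's --examine and --detail options; the exact output will vary:)

Code: Select all

# inspect the persistent superblock written on one member partition
mdadm --examine /dev/sda1
# show the assembled array's state and its UUID
mdadm --detail /dev/md0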

On reboot, I get the error message:

fsck.ext3: Invalid argument while trying to open /dev/md0
/dev/md0:
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>

When booted, md0 seems to have disappeared; cat /proc/mdstat shows nothing.

I have searched around for others having this problem and have tried adding "M md0 b 9 0" to /etc/udev/link.conf and "sleep 10" to /etc/init.d/checkfs.sh, but nothing changes.

Any ideas? Am I missing something? Are RAID arrays supposed to be this ephemeral :? ?

anon

#2 Post by anon »

I have no experience whatsoever with RAID, but http://www.tldp.org/HOWTO/Software-RAID ... html#ss5.9 mentions a "persistent-superblock" option in /etc/raidtab.

icefield
Posts: 5
Joined: 2005-10-23 21:15

#3 Post by icefield »

Unfortunately, raidtools is being dropped in favour of mdadm. Raidtools is not part of Sarge. But then again, there are almost no instructions out there on how to deal with RAID arrays under mdadm alone.

I must be missing something rather obvious because I'm sure disappearing disks would have been noticed during testing :)

anon

#4 Post by anon »

Then perhaps this helps: http://bugs.debian.org/287415
(I'm thinking of going soft-raid-5 myself so I have more than a casual interest).

Jeroen
Debian Developer, Site Admin
Posts: 483
Joined: 2004-04-06 18:19
Location: Utrecht, NL
Contact:

#5 Post by Jeroen »

Create an mdadm.conf to start your RAID array on boot. The array is not started automatically on Linux; some startup script needs to do so.

See man mdadm for more info. Particularly, your mdadm.conf should probably look something like:

Code: Select all

DEVICE /dev/sd??*
ARRAY /dev/md0 auto=yes level=raid5 num-devices=4 UUID=1ba4aede:cb6ac100:45ccc59c:44145aba
Do "mdadm --detail --scan | grep ^ARRAY" to get the nontrivial part out of mdadm. The /etc/init.d/mdadm-raid init script will then start your arrays based on that config file on system bootup. Check also /etc/default/mdadm if that doesn't work.

icefield
Posts: 5
Joined: 2005-10-23 21:15

#6 Post by icefield »

Thanks Jeroen,

I tried building the mdadm.conf file, but I get this error message during boot:

md: md driver 0.90.0
Starting raid devices: md: md0 stopped.
mdadm: no devices found for /dev/md0
done.

Then, a bit later, comes the fsck error.

The sleep 10 that I put into /etc/init.d/checkfs.sh appears to occur after the above mdadm error.

What should I be doing in /etc/default/mdadm ?

I've also seen some references to problems with interactions between udev and the md device nodes.

Jeroen
Debian Developer, Site Admin
Posts: 483
Joined: 2004-04-06 18:19
Location: Utrecht, NL
Contact:

#7 Post by Jeroen »

In /etc/default/mdadm, check whether the autostart option is set to 'yes'. Anyway, something fishy seems to be going on -- check, for example, whether manually invoking /etc/init.d/mdadm-raid start does anything useful. Also check whether the /dev/sd* devices you expect to exist really do exist. If that doesn't work, debug.

If that does work, check whether the same is the case during the initial /etc/init.d/mdadm-raid startup during boot. Insert a 'bash' in the init script to get yourself a root shell during boot to examine the situation, and exit the shell to continue booting afterwards. Check the order of the relevant bootup scripts (udev, mdadm-raid, checkfs). Post them to the forum, along with your mdadm.conf file and the output of mdadm -A -s -a.
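
Roughly, those checks could look like this (script names and paths assumed to match a default Sarge install):

Code: Select all

# does the init script assemble the array when run by hand?
/etc/init.d/mdadm-raid start
cat /proc/mdstat

# do the member devices exist at this point?
ls -l /dev/sd[abcd]1

# in what order do the relevant boot scripts run?
ls /etc/rcS.d/ | grep -E 'udev|mdadm|checkfs'

# assemble every array listed in mdadm.conf, auto-creating the md device node
mdadm -A -s -a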
