RAID 5 can't mount after disk replacement

Postby miky76 » 2021-01-10 16:53

Hi All,
a disk failed in my RAID 5 array, so I replaced the broken disk with a new one.
Now I can't mount the array anymore.

The md0 RAID 5 array was made of sdb, sdc and sdd. sdc failed and I replaced it with a new disk.
What I'd like to do is add the new disk to the array and rebuild everything.

Can you please point me in the right direction to do that?

Code: Select all
mount /dev/md0 /raid
mount: /raid: can't read superblock on /dev/md0.


Code: Select all
 cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdb[3](S) sdd[0](S)
      5860271024 blocks super 1.2

unused devices: <none>


thank you
miky76
 
Posts: 25
Joined: 2014-02-07 08:54

Re: RAID 5 can't mount after disk replacement

Postby p.H » 2021-01-10 17:07

This is not looking good. The array should still be active with only one member missing.
First check superblocks on the remaining members (assuming /dev/sdb and /dev/sdd).
Code: Select all
mdadm --examine /dev/sd[bd]

Then if all looks good add the new drive to the array (assuming it is /dev/sdc)
Code: Select all
mdadm --add /dev/md0 /dev/sdc
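
A minimal sketch of one thing worth comparing in the --examine output, assuming the output for each member has first been saved to a text file. The helper names events_of and events_match are made up for illustration and are not part of mdadm; they only read the Events line, which should be identical on members that are still in sync.

```shell
# Hypothetical helpers, not part of mdadm: they only parse saved text.

# Print the value of the "Events" line from a saved `mdadm --examine` dump.
events_of() {
    awk -F' : ' '/^ *Events :/ { gsub(/ /, "", $2); print $2; exit }' "$1"
}

# Succeed only if every given dump reports the same, non-empty event count.
events_match() {
    first=$(events_of "$1")
    [ -n "$first" ] || return 1
    for dump in "$@"; do
        [ "$(events_of "$dump")" = "$first" ] || return 1
    done
    return 0
}

# Example (file names are assumptions):
#   mdadm --examine /dev/sdb > sdb.txt
#   mdadm --examine /dev/sdd > sdd.txt
#   events_match sdb.txt sdd.txt && echo "event counts agree"
```

If the counts differ, the member with the lower count has missed writes, and adding a new drive should probably wait until that is understood.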
p.H
 
Posts: 1620
Joined: 2017-09-17 07:12

Re: RAID 5 can't mount after disk replacement

Postby miky76 » 2021-01-10 17:40

I'm not an expert, but everything seems OK (output below). However, the add does not work:
Code: Select all
 /usr/sbin/mdadm --add /dev/md0 /dev/sdc
mdadm: Cannot get array info for /dev/md0


Code: Select all
/usr/sbin# /usr/sbin/mdadm --examine /dev/sdb
/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 582394e9:85ddedf1:1c3a63b5:cc9a6f0b
           Name : artemide:0  (local to host artemide)
  Creation Time : Tue Jan  7 23:16:02 2014
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
     Array Size : 5860270080 (5588.79 GiB 6000.92 GB)
  Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=944 sectors
          State : active
    Device UUID : bc71213d:a98e1b82:9d4baa5a:16de33ef

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Jan  7 18:49:06 2021
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 87986c24 - correct
         Events : 12124783

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : A.A ('A' == active, '.' == missing, 'R' == replacing)


Code: Select all
 /usr/sbin/mdadm --examine /dev/sdd
/dev/sdd:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 582394e9:85ddedf1:1c3a63b5:cc9a6f0b
           Name : artemide:0  (local to host artemide)
  Creation Time : Tue Jan  7 23:16:02 2014
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
     Array Size : 5860270080 (5588.79 GiB 6000.92 GB)
  Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=944 sectors
          State : active
    Device UUID : dcb52e3e:c5208223:b3cdf2c0:f0d55ef0

Internal Bitmap : 8 sectors from superblock
    Update Time : Thu Jan  7 18:49:06 2021
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 917fbd95 - correct
         Events : 12124783

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : A.A ('A' == active, '.' == missing, 'R' == replacing)


In the syslog file I found these entries:
Code: Select all
Jan 10 14:26:36 artemide udisksd[520]: Error reading sysfs attr `/sys/devices/virtual/block/md0/md/degraded': Failed to open file “/sys/devices/virtual/bloc$
Jan 10 14:26:36 artemide udisksd[520]: Error reading sysfs attr `/sys/devices/virtual/block/md0/md/sync_action': Failed to open file “/sys/devices/virtual/b$
Jan 10 14:26:36 artemide udisksd[520]: Error reading sysfs attr `/sys/devices/virtual/block/md0/md/sync_completed': Failed to open file “/sys/devices/virtua$
Jan 10 14:26:36 artemide udisksd[520]: Error reading sysfs attr `/sys/devices/virtual/block/md0/md/degraded': Failed to open file “/sys/devices/virtual/bloc$
Jan 10 14:26:36 artemide udisksd[520]: Error reading sysfs attr `/sys/devices/virtual/block/md0/md/sync_action': Failed to open file “/sys/devices/virtual/b$
Jan 10 14:26:36 artemide udisksd[520]: Error reading sysfs attr `/sys/devices/virtual/block/md0/md/sync_completed': Failed to open file “/sys/devices/virtua$

Re: RAID 5 can't mount after disk replacement

Postby p.H » 2021-01-10 18:28

Maybe you need to start the array first.
Code: Select all
mdadm --run /dev/md0
Last edited by p.H on 2021-01-10 18:34, edited 1 time in total.

Re: RAID 5 can't mount after disk replacement

Postby miky76 » 2021-01-10 18:32

p.H wrote:Maybe you need to start the array first.
Code: Select all
mdadm --run /dev/md0


I get this error :(
Code: Select all

/usr/sbin/mdadm --run /dev/md0
mdadm: failed to start array /dev/md/0: Input/output error


details are
Code: Select all
/usr/sbin/mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Tue Jan  7 23:16:02 2014
        Raid Level : raid5
     Used Dev Size : 18446744073709551615
      Raid Devices : 3
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Thu Jan  7 18:49:06 2021
             State : active, FAILED, Not Started
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : unknown

              Name : artemide:0  (local to host artemide)
              UUID : 582394e9:85ddedf1:1c3a63b5:cc9a6f0b
            Events : 12124783

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       -       0        0        1      removed
       -       0        0        2      removed

       -       8       48        0      sync   /dev/sdd
       -       8       16        2      sync   /dev/sdb


I'm not sure why the creation time is 3 days ago and not years ago, and why there are 3 removed RaidDevices.
Last edited by miky76 on 2021-01-10 18:37, edited 2 times in total.

Re: RAID 5 can't mount after disk replacement

Postby p.H » 2021-01-10 18:35

Check the kernel logs with dmesg.
Edit: the size looks awfully wrong (Used Dev Size is reported as 18446744073709551615, which is 2^64 - 1).

Re: RAID 5 can't mount after disk replacement

Postby miky76 » 2021-01-10 18:39

p.H wrote:Check the kernel logs with dmesg.


I get many errors (the RAID-relevant ones are probably at the bottom):

Code: Select all
[    4.878245] pstore: Using compression: deflate
[    4.882994] pstore: crypto_comp_decompress failed, ret = -22!
[    4.883028] pstore: decompression failed: -22
[    4.902351] pstore: crypto_comp_decompress failed, ret = -22!
[    4.902385] pstore: decompression failed: -22
[    4.962922] pstore: crypto_comp_decompress failed, ret = -22!
[    4.962957] pstore: decompression failed: -22
[    5.016237] pstore: crypto_comp_decompress failed, ret = -22!
[    5.016272] pstore: decompression failed: -22
[    5.098688] pstore: crypto_comp_decompress failed, ret = -22!
[    5.098723] pstore: decompression failed: -22
[    5.107955] pstore: crypto_comp_decompress failed, ret = -22!
[    5.107956] pstore: decompression failed: -22
[    5.137800] pstore: crypto_comp_decompress failed, ret = -22!
[    5.137835] pstore: decompression failed: -22
[    5.150472] pstore: crypto_comp_decompress failed, ret = -22!
[    5.150509] pstore: decompression failed: -22
[    5.166447] pstore: crypto_comp_decompress failed, ret = -22!
[    5.166448] pstore: decompression failed: -22
[    5.176114] pstore: crypto_comp_decompress failed, ret = -22!
[    5.176116] pstore: decompression failed: -22
[    5.188467] pstore: crypto_comp_decompress failed, ret = -22!
[    5.188468] pstore: decompression failed: -22
[    5.222206] pstore: crypto_comp_decompress failed, ret = -22!
[    5.222241] pstore: decompression failed: -22
[    5.246657] pstore: crypto_comp_decompress failed, ret = -22!
[    5.246692] pstore: decompression failed: -22
[    5.253314] pstore: crypto_comp_decompress failed, ret = -22!
[    5.253349] pstore: decompression failed: -22
[    5.262549] pstore: crypto_comp_decompress failed, ret = -22!
[    5.262609] pstore: decompression failed: -22
[    5.267030] pstore: crypto_comp_decompress failed, ret = -22!
[    5.267062] pstore: decompression failed: -22
[    5.270345] pstore: crypto_comp_decompress failed, ret = -22!
[    5.270374] pstore: decompression failed: -22
[    5.274518] pstore: crypto_comp_decompress failed, ret = -22!
[    5.274550] pstore: decompression failed: -22
[    5.280942] pstore: crypto_comp_decompress failed, ret = -22!
[    5.280976] pstore: decompression failed: -22
[    5.284079] pstore: crypto_comp_decompress failed, ret = -22!
[    5.284108] pstore: decompression failed: -22
[    5.288812] pstore: crypto_comp_decompress failed, ret = -22!
[    5.288846] pstore: decompression failed: -22
[    5.292355] pstore: crypto_comp_decompress failed, ret = -22!
[    5.292385] pstore: decompression failed: -22
[    5.296725] pstore: crypto_comp_decompress failed, ret = -22!
[    5.296759] pstore: decompression failed: -22
[    5.299550] pstore: crypto_comp_decompress failed, ret = -22!
[    5.299580] pstore: decompression failed: -22
[    5.304431] pstore: crypto_comp_decompress failed, ret = -22!
[    5.304466] pstore: decompression failed: -22
[    5.309323] pstore: crypto_comp_decompress failed, ret = -22!
[    5.309358] pstore: decompression failed: -22
[    5.312909] pstore: crypto_comp_decompress failed, ret = -22!
[    5.312941] pstore: decompression failed: -22

.......

[ 1101.097768] EXT4-fs (md0): unable to read superblock
[ 1101.097894] EXT4-fs (md0): unable to read superblock
[ 1101.097966] EXT4-fs (md0): unable to read superblock
[ 1101.098005] FAT-fs (md0): unable to read boot sector
[ 1357.862958] perf: interrupt took too long (2512 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
[ 1957.015925] perf: interrupt took too long (3145 > 3140), lowering kernel.perf_event_max_sample_rate to 63500
[ 3014.024184] perf: interrupt took too long (3941 > 3931), lowering kernel.perf_event_max_sample_rate to 50750
[ 6668.493744] EXT4-fs (md0): unable to read superblock
[ 6668.493811] EXT4-fs (md0): unable to read superblock
[ 6668.493895] EXT4-fs (md0): unable to read superblock
[ 6668.493960] FAT-fs (md0): unable to read boot sector
[ 7241.579732] md/raid:md0: not clean -- starting background reconstruction
[ 7241.579762] md/raid:md0: device sdb operational as raid disk 2
[ 7241.579763] md/raid:md0: device sdd operational as raid disk 0
[ 7241.584640] md/raid:md0: cannot start dirty degraded array.
[ 7241.604669] md/raid:md0: failed to run raid set.
[ 7241.604671] md: pers->run() failed ...
[ 7254.114246] md/raid:md0: not clean -- starting background reconstruction
[ 7254.114311] md/raid:md0: device sdb operational as raid disk 2
[ 7254.114312] md/raid:md0: device sdd operational as raid disk 0
[ 7254.117831] md/raid:md0: cannot start dirty degraded array.
[ 7254.132768] md/raid:md0: failed to run raid set.
[ 7254.132770] md: pers->run() failed ...

Re: RAID 5 can't mount after disk replacement

Postby p.H » 2021-01-10 18:57

The pstore errors are not related to RAID but to EFI memory mounted as /sys/fs/pstore or so.

The kernel seems to consider the array dirty, but I cannot see why. If the missing drive really failed, I am afraid you have no choice other than forcing the activation with --force, since the array has no redundancy left anyway, but the data may be inconsistent.
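
A possible hint, offered with some caution: both --examine dumps earlier in the thread show "State : active" together with "Array State : A.A". As far as I understand mdadm's output for 1.x superblocks, "active" there means the array was not marked clean when the superblock was last written, which, combined with a missing member, would match the kernel's "dirty degraded" message. A small sketch that tests saved --examine output for that combination follows; the helper names field and is_dirty_degraded are made up for illustration.

```shell
# Hypothetical helpers, not part of mdadm: they only parse saved text.

# Print the value of one "Key : value" line from a saved --examine dump.
# $1: file, $2: exact key name (for example "State" or "Array State")
field() {
    awk -F' : ' -v key="$2" '
        { k = $1; gsub(/^ +| +$/, "", k) }
        k == key { print $2; exit }
    ' "$1"
}

# Succeed if the dump says "State : active" (not marked clean) and the
# slot map in "Array State" contains a '.' (a missing member).
is_dirty_degraded() {
    state=$(field "$1" "State")
    slots=$(field "$1" "Array State")
    slots=${slots%% *}          # keep "A.A", drop the legend after it
    [ "$state" = "active" ] || return 1
    case $slots in
        *.*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example (file name is an assumption):
#   mdadm --examine /dev/sdb > sdb.txt
#   is_dirty_degraded sdb.txt && echo "not clean and missing a member"
```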

Re: RAID 5 can't mount after disk replacement

Postby miky76 » 2021-01-10 19:32

P.H. I think we made it...

Code: Select all
 /usr/sbin/mdadm --assemble --run --force --update=resync /dev/md0 /dev/sdb /dev/sdd
mdadm: /dev/sdb is busy - skipping
mdadm: /dev/sdd is busy - skipping

/usr/sbin/mdadm --stop /dev/md0
mdadm: stopped /dev/md0

/usr/sbin/mdadm --assemble --run --force --update=resync /dev/md0 /dev/sdb /dev/sdd
mdadm: Marking array /dev/md0 as 'clean'
mdadm: /dev/md0 has been started with 2 drives (out of 3).

cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active (auto-read-only) raid5 sdd[0] sdb[3]
      5860270080 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [U_U]
      bitmap: 22/22 pages [88KB], 65536KB chunk

unused devices: <none>


now I think I should add the new disk and resync, right?

Re: RAID 5 can't mount after disk replacement

Postby miky76 » 2021-01-10 19:37

Code: Select all
 /usr/sbin/mdadm --add /dev/md0 /dev/sdc
mdadm: added /dev/sdc


Code: Select all
cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdc[4] sdd[0] sdb[3]
      5860270080 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [U_U]
      [>....................]  recovery =  0.1% (3000828/2930135040) finish=341.4min speed=142896K/sec
      bitmap: 22/22 pages [88KB], 65536KB chunk

unused devices: <none>


Probably I just need to wait now...
thank you very much p.H :D
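
While waiting, the progress line in /proc/mdstat can be polled. A small sketch, assuming the line format shown in the output above; the function name recovery_progress is made up, and it takes an optional file argument so it can be tried on saved output first.

```shell
# Hypothetical helper: print "<percent> <estimated finish>" for each
# recovery/resync line found in mdstat-formatted text.
# $1: file to read (defaults to the live /proc/mdstat)
recovery_progress() {
    awk '/(recovery|resync) *=/ {
        pct = ""; eta = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /%$/) pct = $i
            if ($i ~ /^finish=/) eta = substr($i, 8)
        }
        print pct, eta
    }' "${1:-/proc/mdstat}"
}

# Example: re-check once a minute until no recovery line is left.
#   while [ -n "$(recovery_progress)" ]; do recovery_progress; sleep 60; done
```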

Re: RAID 5 can't mount after disk replacement

Postby p.H » 2021-01-11 08:26

Did you check the data consistency before adding the new drive? If the array is corrupted, resyncing it is just a waste of time; recreate the array and restore the data from backup instead.

Re: RAID 5 can't mount after disk replacement

Postby miky76 » 2021-01-11 11:51

p.H wrote:Did you check the data consistency before adding the new drive? If the array is corrupted, resyncing it is just a waste of time; recreate the array and restore the data from backup instead.


After the resync I browsed the disk, just checking directory and file names, and everything seems OK. How can I know whether the files are corrupted? Do I need to open them one by one manually, or is there another way?

I have a backup of the RAID disk, but before using it I would like to know whether I can just copy it onto the RAID or whether I need to destroy the array and start everything from scratch.

Re: RAID 5 can't mount after disk replacement

Postby p.H » 2021-01-11 12:20

miky76 wrote:I browsed the disk, just checking directory and file names, and everything seems OK. How can I know whether the files are corrupted?

You can check the filesystem metadata with fsck. But it won't check the file data.
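
To check file contents rather than metadata, one option is to compare checksums against a copy that is known to be good, such as the backup. A minimal sketch, assuming sha256sum from GNU coreutils is available; the function name compare_trees and the directory arguments are made up, and files that legitimately changed since the backup was taken will also be reported as FAILED.

```shell
# Hypothetical helper: verify the files under $2 against checksums taken
# from the trusted copy under $1. Prints mismatching or missing files and
# returns non-zero if there are any.
compare_trees() {
    manifest=$(mktemp) || return 1
    # Build the checksum list from the trusted copy, with relative paths.
    ( cd "$1" && find . -type f -exec sha256sum {} + ) > "$manifest"
    # Re-check the same relative paths inside the tree under test.
    ( cd "$2" && sha256sum --quiet -c "$manifest" )
    status=$?
    rm -f "$manifest"
    return "$status"
}

# Example (paths are assumptions):
#   compare_trees /mnt/backup /raid
```

Files that exist only on the array are not covered by this check, since the list is built from the trusted copy.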

miky76 wrote:I have a backup of the RAID disk, but before using it I would like to know whether I can just copy it onto the RAID or whether I need to destroy the array and start everything from scratch.

You do not need to recreate the array from scratch.

