A happy tale of a painless Linux MD RAID recovery

Once again, I’m working away on an emergency (in what seems to be a never-ending stream of urgent requests) when my beloved two-year-old rolls in, pops up behind the desk and asks for a red pen. I say I don’t have one. She says something about cables. “Honey, please don’t touch anything”, I yelp, remembering how last week she pulled the eSATA cable from the external disk enclosure. “Oh, look, a button”, she replies as I helplessly watch the lights go off on the enclosure. “Honey, you just killed my RAID array again! I thought I said not to touch anything?”, I shriek. She figures her work here is done and makes a run for it. This is what she does when she knows she did something wrong. She jets. It’s hilarious.

I turn the enclosure back on; the computer display dims for a second, then comes back and is fully functional. Whew. Last week, when she pulled the cable, in addition to losing the two disks in the external enclosure, one of the internal disks actually died, costing me half of the six-disk array. Had I been using RAID5 or RAID6, it would have been game over (restore from backup), but thanks to RAID10 and the awesomeness of Linux MD RAID, I was able to recover, though I did have to reboot and lost a few hours of unsaved work. So, we’re off to a better start this time.

A second later, I get a nice letter from mdadm:

This is an automatically generated mail message from mdadm
running on dt

A Fail event had been detected on md device /dev/md1.

It could be related to component device /dev/sdk1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid10 sdl1[5] sdk1[6](F) sdc1[2] sda1[0] sdd1[3] sdb1[1]
      1465151808 blocks 64K chunks 2 near-copies [6/5] [UUUU_U]
      
unused devices: <none>

Followed by a thoughtful note from the S.M.A.R.T. daemon:

This email was generated by the smartd daemon running on:

   host name: dt
  DNS domain: [Unknown]
  NIS domain: (none)

The following warning/error was logged by the smartd daemon:

Device: /dev/sdk [SAT], unable to open device

For details see host's SYSLOG (default: /var/log/syslog).

You can also use the smartctl utility for further investigation.
No additional email messages about this problem will be sent.

Hmm, so one of the drives didn’t come back. I rescanned the SCSI bus with rescan-scsi-bus.sh (part of the scsitools package in Debian/Ubuntu). Still nothing, so the next step is to unplug and replug the missing disk, but I don’t want to unplug a working device by mistake. Fortunately, most manufacturers print the serial number on the top surface of the drive (in addition to the front), and these Western Digital RE3 drives are no exception. So I was able to figure out which disk was missing without pulling out working disks: grab the serial numbers of all the present disks with “smartctl -i” and compare them with the numbers printed on the top of the physical drives (UPDATE: instead of tedious manual comparison, use this method to get the serial number of the failed disk). I unplugged the missing disk, plugged it back in, re-scanned the SCSI bus again and, voilà, the drive showed up as /dev/sdm.
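
For what it’s worth, a small loop takes most of the tedium out of the manual comparison. This is just a sketch, assuming the still-visible disks match /dev/sd[a-l] (roughly what this box looks like); adjust the glob for your setup:

# Print each device followed by the serial number smartctl reports for it;
# a disk that has dropped off the bus will simply error out here.
for d in /dev/sd[a-l]; do
    printf '%s: ' "$d"
    smartctl -i "$d" | grep -i 'serial number'
done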

Ok, time to fix the array. Here's the status:

cat /proc/mdstat
md1 : active raid10 sdl1[5] sdk1[6](F) sdc1[2] sda1[0] sdd1[3] sdb1[1]
      1465151808 blocks 64K chunks 2 near-copies [6/5] [UUUU_U]
mdadm -D /dev/md1
/dev/md1:
        Version : 00.90
  Creation Time : Wed Dec  8 15:10:08 2010
     Raid Level : raid10
     Array Size : 1465151808 (1397.28 GiB 1500.32 GB)
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Feb  4 16:32:43 2011
          State : clean, degraded
 Active Devices : 5
Working Devices : 5
 Failed Devices : 1
  Spare Devices : 0

         Layout : near=2, far=1
     Chunk Size : 64K

           UUID : a60b30f2:741aa1ab:e368bf24:bd0fce41
         Events : 0.11870

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1
       4       0        0        4      removed
       5       8      177        5      active sync   /dev/sdl1

       6       8      161        -      faulty spare

Let's add /dev/sdm back to the array:

mdadm /dev/md1 -a /dev/sdm1
mdadm: re-added /dev/sdm1
cat /proc/mdstat 
md1 : active raid10 sdm1[6] sdl1[5] sdk1[7](F) sdc1[2] sda1[0] sdd1[3] sdb1[1]
      1465151808 blocks 64K chunks 2 near-copies [6/5] [UUUU_U]
      [>....................]  recovery =  0.4% (2074816/488383936) finish=159.9min speed=50679K/sec
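
The rebuild will take a few hours, so it helps to keep an eye on it. Nothing here is specific to this box, and the speed numbers are just placeholders (in KiB/s):

# Re-read /proc/mdstat every few seconds to follow the progress bar, ETA and speed.
watch -n 5 cat /proc/mdstat

# If the resync is being throttled and the machine is otherwise idle, the kernel's
# global rebuild speed limits can be raised (values are KiB/s per device).
sysctl -w dev.raid.speed_limit_min=50000
sysctl -w dev.raid.speed_limit_max=200000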

Great, the array is rebuilding, but the old device still shows up as faulty:

mdadm -D /dev/md1
/dev/md1:
        Version : 00.90
  Creation Time : Wed Dec  8 15:10:08 2010
     Raid Level : raid10
     Array Size : 1465151808 (1397.28 GiB 1500.32 GB)
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Feb  4 16:42:45 2011
          State : clean, degraded, recovering
 Active Devices : 5
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 1

         Layout : near=2, far=1
     Chunk Size : 64K

 Rebuild Status : 2% complete

           UUID : a60b30f2:741aa1ab:e368bf24:bd0fce41
         Events : 0.12252

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1
       6       8      193        4      spare rebuilding   /dev/sdm1
       5       8      177        5      active sync   /dev/sdl1

       7       8      161        -      faulty spare

Let's get rid of it:

mdadm /dev/md1 -r faulty
mdadm: hot removed 8:161
mdadm -D /dev/md1
/dev/md1:
        Version : 00.90
  Creation Time : Wed Dec  8 15:10:08 2010
     Raid Level : raid10
     Array Size : 1465151808 (1397.28 GiB 1500.32 GB)
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Fri Feb  4 16:43:12 2011
          State : clean, degraded, recovering
 Active Devices : 5
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 1

         Layout : near=2, far=1
     Chunk Size : 64K

 Rebuild Status : 2% complete

           UUID : a60b30f2:741aa1ab:e368bf24:bd0fce41
         Events : 0.12273

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1
       6       8      193        4      spare rebuilding   /dev/sdm1
       5       8      177        5      active sync   /dev/sdl1

Looks good. Now just let it rebuild; meanwhile, I can keep working. One of the things I like about RAID10 is that, in addition to offering better resiliency and performance than RAID5 or RAID6 in optimal mode, it also performs better in degraded mode.
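
If you’d rather be told when the recovery is done than keep polling /proc/mdstat, mdadm can block until the array has finished resyncing. A minimal sketch, using the device name from above:

# Block until any resync/recovery on /dev/md1 has finished, then announce it.
mdadm --wait /dev/md1 && echo 'md1 rebuild finished'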

The only knock against it is that Linux can’t currently boot from it and that half the disks are lost to redundancy. Being able to lose up to half of the array has already saved me a couple of times, though, so I’m not complaining. To solve the /boot partition limitation, I recently started placing the boot partition on a USB flash drive, so all the hard disk drives can be dedicated to RAID10.
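
A sketch of how that layout might be wired up in /etc/fstab, with /boot on the USB stick referenced by UUID so it survives device-name shuffles; the UUID and filesystem type below are placeholders, not taken from this machine:

# Hypothetical fstab entry: small /boot partition living on the USB flash drive.
UUID=0f3db62d-8a3e-4c1a-9a6e-0123456789ab  /boot  ext2  defaults,noatime  0  2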

Perhaps once solid state disks come down in price, I’ll switch to those, and if they turn out to be sufficiently more reliable than spinning disks, maybe consider a switch to RAID5 or RAID6. I should probably also upgrade my CPUs, though, since they’ll have to spend some of their time on parity calculations. Right now, all three of my home computers are running fairly ancient dual-core Pentium D processors.

2 Comments

  • 1. mgs replies at 30th April 2011, 4:36 pm :

    The only knock against it is that Linux can’t currently boot from it and that half the disks are lost to redundancy.

    I believe Linux (at least Debian Squeeze) boots just fine from RAID10. I just finished migrating my system from RAID1 to RAID10, but since this was an existing system, it might just be that the installer does not support this ATM.

  • 2. Marc Warne replies at 14th March 2012, 8:32 am :

    It’s more that Grub (at least grub-legacy, i.e. < 1.00) can't read kernel images/initrds/menu.lst from a RAID-10 set but can from a RAID-1 set.

    The solution for me is simply to have a small RAID-1 /boot partition across all drives and then RAID-10 the rest of the disks.

    Marc
    GigaTux
