Well, the point that software RAID only protects against a disk failure and not a power failure was driven home last night. I run a 3-disk RAID 5 on a 2.6 kernel with the raidtools package. The machine suffered a power supply failure, and when I brought it back up, recovery was needed.
The RAID 5 is hdf1, hdg1, and hdh1. The system complained about hdg1 not being "fresh", so I told it to rebuild. The process started fine, so I went to sleep.
The next morning the server was stuck trying to rebuild. Looking back at the logs, it hit an I/O error on hdh1 during the recovery process and kicked that disk out of the RAID.
So now I have a RAID 5 with one known good disk, one with a failed resync, and one that still acts OK except for what looks like a bad block. Before I go off and run this lengthy rebuild tool from here, does anyone know a way to just have the system bring the RAID 5 back online in read-only mode, ignoring a block error or two?
I found some references to ckraid and some scanning/recovery switches it used to have. Newer versions got rid of these, though, in favor of moving automatic recovery into the kernel, which leaves me with no way to try it manually, it seems.
The critical data from that RAID is backed up, thankfully. However, the array also held my MP3 collection, which has no backup, so I'd much rather try for a recovery solution than re-rip the music I just re-ripped 4 months ago.