Thursday, November 01, 2007

Problem with WD drive dropping out of RAID set

I have four WD5000 drives in a RAID5 configuration on the ICH7R chipset. Generally these drives have performed fairly well, but occasionally one has just dropped out of the RAID set. The first time it happened, of course, I thought I had landed myself a bum drive, but I was able to bring it back into the set with a simple reboot or by marking it as normal in the Intel Matrix Storage Manager. Then I would have to wait 12 hours for the set to rebuild while it ran with degraded performance. After this happened for the third time I started getting a little worried, and then last night two of the drives dropped off. If you know anything about RAID5, you know this means you have lost access to your data! Arrhhhhhggggg! Fortunately I have been fairly diligent about backing this machine up, so everything is safe, although this could have wasted many hours this weekend.
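
For anyone wondering why one missing drive is survivable but two are not, here is a minimal Python sketch of the XOR parity idea behind RAID5. It is only an illustration, not how the ICH7R actually lays out data: it treats a stripe as three data blocks plus one parity block, ignores the way real RAID5 rotates parity across the drives, and the function names (xor_blocks, make_stripe, rebuild) are made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks plus one parity block (the XOR of the data)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe):
    """Rebuild a stripe in which each missing block is given as None.

    One missing block equals the XOR of the surviving blocks.
    With two or more missing there is not enough information left.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("two or more drives missing: stripe is unrecoverable")
    repaired = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired


# Hypothetical four-drive stripe: three data blocks and one parity block.
stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])

# One drive drops out: the missing block can be recomputed.
degraded = [stripe[0], None, stripe[2], stripe[3]]
assert rebuild(degraded) == stripe
```

Passing a stripe with two None entries to rebuild raises a ValueError, which is the toy version of what happened to me last night. The same XOR recomputation across every stripe is also roughly why a rebuild takes hours on drives this size.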

Anyway, when I went into the storage manager at boot, I was able to recover one of the drives, and then I brought the other one back using the Intel storage manager in Windows. However, as I was looking at the drive specs in the storage manager, I noticed that the firmware versions were different. So I did a search for my model of drive, and wham! The first hit was a technical bulletin from Intel about how WD5000 drives can spontaneously drop out of a RAID set on an Intel NAS. Interesting, I thought. I then followed the link to the WD site and, lo and behold, this is a known problem. The fix is to update the firmware to the version that only half my drives have.

So I took the drives out of the RAID set in the BIOS and then booted using UBCD. I ran the wd5000ys.exe utility as recommended and waited for the message saying the firmware had been upgraded. I waited....and waited.....and waited.....until I realized that, even with 4 drives, 20 minutes was more than enough time to give it. In fact, it shouldn't take more than about 2 minutes per drive. I rebooted and tried it again, running it in the 32-bit command shell and in the 16-bit Windows DOS command shell (run using command.com). Still no success. So I rebooted and popped my drives back into the RAID set. Everything came back up, but of course my drives are still using the old firmware. My next step is to maybe try a bootable DOS disk. If that turns out to be what it takes, then I can't believe that in this day and age WD still requires the use of a damn bootable DOS diskette!
