Creative Communities of the World Forums

The peer to peer support community for media production professionals.


  • RAID 5 Drive failure

    Posted by Rex Polanis on November 21, 2013 at 8:37 pm

    I have a RAID 5 array and a drive has failed. I cannot access files from the array. It shows up in “My Computer,” but I can only access some files.

    This is my older system, but it has photos that I need to access for a project.

    System:
    Asus M2N-SLI Deluxe
    AMD Athlon X4 640 @ 3.01 GHz
    8 GB DDR2 RAM
    128 GB SSD
    3× 500 GB WD drives set up as a 1 TB RAID 5
    Windows 7 64-bit

    Can someone please help me? Thank you.

    One man with courage makes a majority.

    Canon 7D
    Adobe CC Master Suite
    Digital Juice
    Video Copilot

  • 4 Replies
  • Alex Gerulaitis

    November 21, 2013 at 9:49 pm

    The very first thing I’d do is attempt to clone the volume, e.g. using Casper or some other block-level tool, and keep a log of the process to see whether you get any read errors, and how many.

    The fact that you can read some files gives hope.

    The problem can be caused by file system corruption (not RAID-related), RAID engine failure, and/or drive failure. That’d be the second thing to figure out, and I’m afraid you will need expert help on that.

  • Rex Polanis

    November 27, 2013 at 4:48 pm

    I fixed it.

    This is what I did:
    During the boot-up process I pressed F10.
    This brought up the built-in Nvidia RAID control utility.
    The screen there showed two drives in the RAID:
    one drive bad and another drive degraded.
    I selected the bad drive and pressed Enter for information;
    the utility showed the bad drive was on eSATA port 2.1.
    I went back and checked the degraded drives (meaning healthy but not active), and the Nvidia utility showed the degraded drives were on eSATA ports 3.0 and 3.1.

    With this information I was able to figure out which drive was bad.
    I powered down, removed the bad drive, and replaced it with a new drive of the same make, model, and size.

    I rebooted and hit F10 again.
    The Nvidia utility now saw one RAID drive with the option to rebuild.
    I hit “r” to rebuild the selected RAID and left the computer on, in that utility, overnight.

    The next day I exited the Nvidia utility and rebooted straight to Windows 7, and all my information was restored!

    Hooray!
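    Conceptually, what the controller did overnight is XOR parity reconstruction: in RAID 5 every stripe's data and parity blocks XOR to zero, so each block of the replacement drive can be recomputed from the matching blocks on the survivors. A toy sketch of that idea (drive contents invented for illustration, not the Nvidia utility's actual code):

```python
def xor_blocks(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def rebuild_drive(survivors):
    """Recompute a lost RAID 5 member block by block.

    Each element of `survivors` is one remaining drive, given as a
    list of blocks. Because data ^ data ^ parity == 0 per stripe,
    the missing member is simply the XOR of all surviving members.
    """
    rebuilt = []
    for stripe in zip(*survivors):
        block = stripe[0]
        for other in stripe[1:]:
            block = xor_blocks(block, other)
        rebuilt.append(block)
    return rebuilt

# Toy 3-drive array, one 4-byte block per drive; parity = d0 ^ d1.
d0 = [b"\x01\x02\x03\x04"]
d1 = [b"\x10\x20\x30\x40"]
parity = [xor_blocks(d0[0], d1[0])]

# "Lose" d1 and rebuild it from d0 and the parity drive:
assert rebuild_drive([d0, parity]) == d1
```

    This is also why the rebuild takes all night on real hardware: every stripe of every surviving drive has to be read back to regenerate the new disk.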


  • Ericbowen

    November 27, 2013 at 6:51 pm

    I would run a parity check now, or move the data off and rebuild the volume. Had the RAID volume info been clean, you should have seen your data with just one bad drive. I would not trust that RAID volume’s data yet.

    Eric-ADK
    Tech Manager
    support@adkvideoediting.com

  • Chris Murphy

    December 5, 2013 at 8:00 am

    I agree. A scrub (parity check) is a minimum requirement. It’s a needle in a haystack, but it’s typical for drives to spit out garbage as they die, and neither RAID nor the file system has any means of disputing that garbage. It just causes confusion and strange OS behavior, including file system problems.

    The other possibility is that one or more of the surviving drives has bad sectors causing transient read failures. Those cause delays while data is rebuilt from parity (on the fly) in the array’s degraded state. Upon rebooting into the RAID firmware interface and initiating the rebuild onto a replacement drive, the remaining drives are permitted to take quite a while (30 seconds, maybe more) making multiple attempts at reading those marginal sectors. The thing is, without an explicit read failure those sectors aren’t fixed (or remapped). And if they did produce a read error, their data wouldn’t be returned, which would mean a collapsed RAID 5 array. So… yeah. I wouldn’t trust it.
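    The scrub being recommended here just re-reads every stripe and checks the RAID 5 invariant that all of a stripe's members (data plus parity) XOR to zero. A minimal in-memory sketch of that check (toy byte-string "drives", not a real controller API):

```python
def scrub(drives):
    """Return indices of stripes whose members no longer XOR to zero.

    Each drive is a list of equal-length blocks. A nonzero XOR means
    silent corruption somewhere in that stripe; note the scrub can
    flag the stripe but cannot tell which drive returned bad data.
    """
    bad = []
    for i, stripe in enumerate(zip(*drives)):
        acc = bytes(len(stripe[0]))          # zero block accumulator
        for block in stripe:
            acc = bytes(x ^ y for x, y in zip(acc, block))
        if any(acc):
            bad.append(i)
    return bad

# Two data drives plus a parity drive, two stripes each:
d0 = [b"\x01\x02", b"\xaa\xbb"]
d1 = [b"\x10\x20", b"\x0f\x0e"]
par = [bytes(x ^ y for x, y in zip(a, b)) for a, b in zip(d0, d1)]

assert scrub([d0, d1, par]) == []    # healthy: every stripe XORs to zero
d1[1] = b"\xff\x0e"                  # simulate silent corruption
assert scrub([d0, d1, par]) == [1]   # scrub flags stripe 1
```

    That last point is the catch Chris describes: the scrub can detect the mismatch, but with a single parity block there is no way to know which member lied, which is why the data still shouldn’t be trusted.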

    And that’s why buying cheap drives not designed for use with RAID is shooting yourself in the foot. Desktop drives are explicitly designed with long error recovery times; a RAID-rated drive instead produces an error quickly, which lets the RAID fix that sector from parity, so bad sectors don’t accumulate. With the wrong kind of drives, they do.
