Creative Communities of the World Forums


  • RAID 6, With or Without Hot Spare

    Posted by Kevin Patrick on January 5, 2014 at 2:50 pm

    Being paranoid (inexperienced and uneducated), I plan on setting up my Pegasus2 R8 as RAID 6.

    My question is whether or not I should also designate a hot spare drive in the array.

    If I have to make this decision on my own (meaning me and the internet), I’d say yes. Designate a hot spare.

    That would be the safe way to go.

    Right?

  • 20 Replies
  • Alex Gerulaitis

    January 5, 2014 at 9:52 pm

    Not necessarily. Having a cold spare on a shelf is more efficient if you don’t run your array 24/7, assuming you’ll know immediately when a drive has failed.

    A hot spare means that (a) it’s just sitting there doing nothing until a drive fails, (b) your capacity and efficiency are down by another 1/8th, something to consider given that RAID6 already takes away 2/8ths of the capacity (see the capacity sketch following this reply), and (c) an auto-rebuild may bring performance down quite a bit, possibly for more than a day.

    In other words, hot spares make more sense on larger arrays running RAID6.

    On smaller arrays (fewer than 16 spindles), my personal preference when a drive is marked as failed is to check that it has indeed failed rather than merely been “marked as failed” because of a timeout. Quite often the drive is actually healthy, just hiccupped, and can be marked as “online” w/o side effects. (A rebuild may still be necessary.)

    I also prefer to start and monitor a rebuild manually on smaller arrays with non-24/7 duty. Auto-rebuild in the middle of a project may not be a good idea.

    If it’s uptime and data protection you’re looking for: RAID6 with a cold spare or two sitting on a shelf, and backups.

    — Alex Gerulaitis | Systems Engineer | DV411 – Los Angeles, CA
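
A minimal capacity sketch of the trade-off described above, assuming an 8-bay enclosure like the Pegasus2 R8 with identical drives; the 4 TB drive size is a hypothetical placeholder, not a spec:

```python
# Hypothetical illustration of the capacity overhead discussed above.
# Assumes an 8-bay enclosure with identical drives; DRIVE_TB is a placeholder.

def usable_tb(bays: int, drive_tb: float, redundancy_drives: int, hot_spares: int = 0) -> float:
    """Usable capacity after `redundancy_drives` worth of space goes to
    parity/mirroring and `hot_spares` drives sit idle awaiting a failure."""
    data_drives = bays - redundancy_drives - hot_spares
    return data_drives * drive_tb

BAYS, DRIVE_TB = 8, 4.0  # hypothetical 8 x 4 TB layout

for label, redundancy, spare in [("RAID 6", 2, 0),
                                 ("RAID 6 + hot spare", 2, 1),
                                 ("RAID 10", 4, 0)]:
    cap = usable_tb(BAYS, DRIVE_TB, redundancy, spare)
    print(f"{label:20s} usable {cap:5.1f} TB ({cap / (BAYS * DRIVE_TB):.1%} of raw)")
```

With these placeholder numbers the script prints 75% usable for RAID 6, 62.5% with a hot spare added, and 50% for RAID 10, matching the fractions discussed in the thread.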

  • Chris Murphy

    January 6, 2014 at 5:34 am

    What’s this for? Video production work or as a backup? What drives?

    The raid6 RMW penalty is significantly worse than raid5’s (see the small-write sketch following this reply), plus the hot spare means you’re at best at 5x read speeds, which, depending on the drives, will vary a lot unless they are short-stroked. Once even one disk has failed, I’m very skeptical that the degraded-mode performance will supply what you need to actively use the array while the rebuild is occurring. So I’m not sure what the extra drive redundancy gets you. In any case there should be sufficient regular backups that you shouldn’t need to depend on raid6. If you’re going to forgo regular backups and think raid6 bridges that gap, I think that’s a mistake.

    As for a hot spare, forget it. It costs you both performance and capacity and isn’t worth it. Before raid6 I’d consider raid10. Before raid6+hot spare I’d consider raid15. But before all of that, make sure you’re doing regular scrubs on the array, use drives that have proper error timeouts so their errors are actually corrected during scrubs, have one or two same-model drives on hand, and properly replace them when the time comes. If you don’t do those things you can still get bit in the ass with raid6 and a hot spare. And if you don’t need the extra capacity this layout implies, get an R6.
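
A minimal sketch of the small-write (read-modify-write) I/O counting behind the raid5 vs. raid6 penalty mentioned above, assuming the textbook RMW path and ignoring controller caching:

```python
# Classic I/O counting for a small, sub-stripe write (ignoring controller caches).
# raid5: read old data + old parity, write new data + new parity  -> 4 I/Os
# raid6: the same, but with two parity chunks (P and Q)           -> 6 I/Os

def rmw_ios(parity_chunks: int) -> int:
    # 1 data read + N parity reads, then 1 data write + N parity writes
    return (1 + parity_chunks) * 2

print("raid5 small-write I/Os:", rmw_ios(1))  # 4
print("raid6 small-write I/Os:", rmw_ios(2))  # 6
```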

  • Alex Gerulaitis

    January 6, 2014 at 6:16 am

    [Chris Murphy] “plus the hot spare means you’re at best 5x read speeds”

    You sure it’s not 7x, Chris?

    (I understand real-world numbers support your thesis, yet in theory RAID 6 has the same read performance as RAID 0. Perhaps it’s the implementation that brings the performance down, not the RAID level.)

  • Chris Murphy

    January 6, 2014 at 6:42 am

    Yes, because with eight drives in raid6 plus a hot spare, the number of stripe members we get data chunks from during full-stripe reads is five. Parity chunks do not count, and that’s 2 “drives” worth. And the 3rd is the unused hot spare.

  • Alex Gerulaitis

    January 6, 2014 at 7:05 am

    Makes sense – thank you.

  • Chris Murphy

    January 6, 2014 at 7:06 am

    So yes, you’re right: raid5, raid6, and raid0 have no read penalty, but to compute the performance factor you only count data drives. A five-drive raid0 has five data drives, so that’s 5x reads. A six-drive raid5 also has five data drives (yes, it’s distributed parity, but for any full-stripe read one of the drives produces a parity chunk, which is not read during normal operation). And a seven-drive raid6 likewise has five data drives. So for reads, those raids should perform the same, all other things being equal. They each have an equal number of data drives, but unequal total drives (see the counting sketch following this reply).
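
A small sketch of the data-drive counting walked through above; it simply subtracts parity members and idle hot spares, ignoring controller overhead and drive-to-drive variation:

```python
# Counting "data drives" for full-stripe reads, per the reasoning above.
# Simplification: read factor ~= number of drives contributing data chunks;
# parity chunks and idle hot spares contribute nothing on normal reads.

def read_factor(total_drives: int, parity_drives: int, hot_spares: int = 0) -> int:
    return total_drives - parity_drives - hot_spares

layouts = {
    "5-drive RAID 0":             read_factor(5, 0),
    "6-drive RAID 5":             read_factor(6, 1),
    "7-drive RAID 6":             read_factor(7, 2),
    "8-drive RAID 6 + hot spare": read_factor(8, 2, hot_spares=1),
}

for name, factor in layouts.items():
    print(f"{name:28s} ~{factor}x single-drive read speed")
```

All four layouts come out to ~5x, which is the point of the comparison above.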

  • Kevin Patrick

    January 7, 2014 at 3:34 pm

    Chris,

    It sounds like you’re suggesting RAID 5, no hot spare. Correct?

    In my limited RAID experience, I’ve never had a hot spare. But I thought a hot spare would be better than a cold spare. Aren’t HDDs happier when they’re spinning? They’re designed to spin, not sit, right?

    Although I think I understand the reasoning behind not having a hot spare, since you can manually control when and whether the array is rebuilt (as Alex points out).

    Still, wouldn’t it be better for the drive to be sitting in the array, always up and running? I could still use that single HDD for other purposes (maybe cloned backups of my boot drive?). If a drive fails, shouldn’t I be able to turn that extra HDD into the replacement HDD?

    Wouldn’t RAID 5 (7 drives) plus a scratch drive (to be used as a spare if/when needed) be kind of like RAID 6, but with the benefit of using the non-array drive for other purposes until it’s needed for a rebuild? Plus, I’d be running at RAID 5 performance the entire time.

    Also, why RAID 10 over RAID 6? RAID 0+1 or 1+0?

    My understanding is that RAID 10 can recover from 1 drive failure per span. With an 8-drive array, that would be two spans, so it could recover if you lost one drive from each, a two-drive failure. But I guess you could lose more drives on one span and still recover, assuming you lost no drives on the other span. Correct? (See the failure-combination sketch following this post.)

    Is RAID 10 faster for writes than RAID 6, given the parity overhead, with 8 drives and no spares?

    Kevin
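
A hedged sketch of the failure combinations an 8-drive RAID 10 can survive, assuming the striped-mirrors (1+0) layout of four two-drive pairs rather than the two-span 0+1 layout described above; controllers lay this out differently, so treat it as illustrative only:

```python
# Which drive-failure combinations does an 8-drive RAID 10 survive?
# Assumes four 2-drive mirrored pairs striped together (one common layout);
# the array survives as long as no mirrored pair loses both of its members.
from itertools import combinations

PAIRS = [(0, 1), (2, 3), (4, 5), (6, 7)]  # drive indices grouped into mirrors

def survives(failed: set) -> bool:
    return all(not (a in failed and b in failed) for a, b in PAIRS)

for n_failed in (1, 2, 3, 4):
    combos = list(combinations(range(8), n_failed))
    ok = sum(survives(set(c)) for c in combos)
    print(f"{n_failed} failed drive(s): survives {ok}/{len(combos)} combinations")
```

Under this layout the array survives any single failure, most (24 of 28) two-drive failures, and can survive up to four failures if each hits a different pair; RAID 6, by contrast, survives every two-drive failure but never a third.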

  • Ericbowen

    January 7, 2014 at 11:46 pm

    Raid 6 is far more reliable than raid 5 because the controller has 2 levels of parity to verify against. The problem with current drives is that parity information can corrupt on 1 drive over time, causing considerable damage to the raid’s integrity. It is not uncommon to lose a raid 5 array to corruption because of this. However, raid 6, having that extra level of parity, protects against it, provided you have the parity verification scheduled periodically. It is extremely rare to have 2 discs corrupting over time, especially in the same parity data regions. I have never seen a raid 6 completely unravel with NTFS as the file system. The worst I have seen on a raid 6 is a few files corrupted at most. Also, the performance difference between an 8-drive raid 5 and an 8-drive raid 6 is about 720MB/s for the raid 5 versus 650MB/s for the raid 6. Not nearly enough to justify the greater chance of failure.

    Raid 10 is block-level mirroring and far more expensive than raid 6 in the number of disks used. You can only verify the data of one mirror against the other, versus 2 levels of parity, so verification is not nearly as good as with raid 6. The performance of an 8-drive raid 10 would still be around that of an 8-drive raid 6, so there is no gain there. I do not suggest raid 10 unless dealing with massive disk arrays where the overall failure rate is much larger and needs to be reduced to mirrored-partner-level percentages.

    Eric-ADK
    Tech Manager
    support@adkvideoediting.com

  • Chris Murphy

    January 8, 2014 at 8:00 am

    I’m not recommending raid5; maybe raid6 is a good fit. But you asked an open-ended question without any detail on the use case, workload, backup strategy, or the drives being used. So I’m just poking holes. Maybe raid6 fits your use case exactly right.

    RAID6+hot spare is a red flag to me; it says, “this data is really super important and needs to always be available.” And it necessarily implies the workload can still get by on degraded performance, or that you can tolerate waiting for the rebuild. Video production workloads are demanding, as is a parity array rebuild, so I’m skeptical that you can do both in a 1x-fail raid6. But I have to defer to more experienced people on this question. For a 2x-fail raid6, the performance hit must preclude doing any meaningful work.

    Let’s look at the rebuild times. The Promise web site says the “Pegasus2 R8 32TB model is populated with 5900 RPM SATA drives,” which is why I asked what drives you’re using. Those are probably Seagate Barracuda XTs, which average about 130MB/s on sequential writes, meaning it will take around 30 hours to fully write. Raid1/10 can rebuild at the drive’s max sustained write speed. Parity raid will take longer. How much longer depends on the controller (see the back-of-the-envelope sketch following this reply).

    So there are three questions: Is the 1x degraded raid6 performance still decent enough to get work done? If not, can you tolerate the downtime for the rebuild? And how much longer than 30 hours is it?

    There are all sorts of ways to resolve this. One extreme is getting 10K-15K RPM SAS drives of smaller capacities so the rebuild times are shorter and the performance is better, except the Promise spec sheet says drive support is SATA, not SAS, so that’s maybe a dead end. The other is raid10, which gives you a much smaller performance hit in degraded mode, even with a 3x drive failure, let alone the 1x failure you’re most likely to encounter; the rebuild time is also faster.

    Also, raid6+hot spare means you’re setting aside 37.5% of your storage capacity. The additional 12.5% hit of raid10 (which sets aside 50%) isn’t much compared to what raid10 gains you over raid6+hot spare.

    With regard to raid10 vs. 0+1: use raid10 over 0+1 because it always rebuilds faster, and you can survive losing more drives than with 0+1. Their performance in normal mode is the same.

    You can relax your concern about whether drives are better off spinning or on a shelf. Consider that the manufacturers themselves will keep a pile of a given drive model in reserve for years. I don’t consider raid5+hot spare to be raid6. And I’m unaware of an implementation that permits a hot spare to be a working read-write mount that is suddenly yanked from the user without warning, wiped, and used in a rebuild when a single disk failure calls it to duty.
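
A back-of-the-envelope sketch of the rebuild-time floor being estimated above; the drive capacity and write speed below are purely hypothetical placeholders (not a claim about the drives in this thread), and the result is only a theoretical floor that a mirror rebuild can approach while a parity rebuild under load can take several times longer:

```python
# Floor on rebuild time: the replacement drive must be written end to end,
# so the rebuild can never finish faster than capacity / sustained write rate.
# Degraded parity math, controller overhead, and a live workload all push
# real-world rebuild times well past this floor.

def rebuild_floor_hours(capacity_tb: float, sustained_write_mb_s: float) -> float:
    capacity_mb = capacity_tb * 1_000_000  # decimal TB -> MB, as drives are rated
    return capacity_mb / sustained_write_mb_s / 3600

# Hypothetical drive figures, for illustration only:
print(f"floor: {rebuild_floor_hours(6.0, 180.0):.1f} hours")
```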

  • Chris Murphy

    January 8, 2014 at 9:14 am

    If silent data corruption is a real problem needing mitigation, then we can’t consider non-checksumming parity raid6 qualified to deal with it. That realm belongs to ZFS, Btrfs, ReFS, and PI.

    Parity isn’t a checksum, so any disagreement between a data chunk and a parity chunk, even when two parity chunks agree in a mismatch against a data chunk, is still ambiguous (see the sketch following this reply). Hence the “write hole” applies to raid6 every bit as much as to raid5 and raid1. It’s an assumption to defer to two agreeing Q and P chunks against a data chunk, and in fact this is the wrong strategy, because this very thing can happen in a power failure where data chunks were correctly written but parity chunks were not: they keep their old values and therefore still agree with each other.

    Further, in normal operation, parity chunks aren’t even consulted, so the system doesn’t know about any mismatches. That’s why regular scrubs are important. Most cases of total raid5 collapse despite only a single drive failure are due to wrong setups: the wrong drives were spec’d, scrubs weren’t scheduled, the drive and controller timeouts weren’t set correctly. Bad sectors end up not being fixed. A drive fails, the rebuild commences, a bad sector is encountered, but since we’re degraded there’s no parity to rebuild from, and the array collapses even though there’s only been a single drive failure. So the “way around” this is to go with raid6 rather than fix the underlying sources of the problem.

    Now, there’s no question that drive sizes are growing more quickly than drive performance. Therefore rebuild times are going up a ton, and that’s why it’s sane to recommend raid6: a 2nd drive could die during the rebuild. But corruption mitigation isn’t the use case. Raid is intended to defer checksumming and corruption mitigation to the hardware: the drive actually writes checksums to each sector, and its ECC is designed to detect and correct problems. If it can’t, it should report a read error, and then the raid can do something about that in normal operation.
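
A tiny sketch of the “parity isn’t a checksum” point above, simplified to single XOR parity (the ambiguity argument carries over in spirit to P/Q parity):

```python
# Parity detects a mismatch but cannot say which chunk is wrong.
# Simplified to single XOR parity over three data chunks.

data = [0b10110010, 0b01101100, 0b11100001]   # three "data chunks"
parity = data[0] ^ data[1] ^ data[2]

# Silently corrupt one data chunk (the array doesn't know which one).
data[1] ^= 0b00010000

mismatch = (data[0] ^ data[1] ^ data[2]) != parity
print("scrub sees a mismatch:", mismatch)  # True
# The mismatch alone does not identify WHICH of the four chunks
# (three data + parity) is the corrupt one -- that takes per-chunk checksums.
```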

