Forums › Storage & Archiving › what to get with 20K
Bruce Little
January 7, 2011 at 5:36 am
Hi Bob,
Thanks for all the great info – nice list of clients too! And yes, the price is appealing for 16TB, because I'm sure 8TB will not be sufficient… but I also need to consider that there are a couple of Avid and a couple of FCP systems that will need to be on this storage.
Eric Cox
January 8, 2011 at 6:46 am
Don't do RAID 5 on multi-terabyte volumes. Even if the array has hot spares available, this is asking for trouble. The problem is that the chance of an unrecoverable read error while rebuilding from parity is now so high that rebuilds can fail. The read error rate has stayed roughly constant, but disk sizes have scaled up dramatically since the days of RAID 5's inception.
At this stage nearly everyone has a RAID 6 implementation, or something similar based on a proprietary scheme. Sure, you lose another disk's worth of space to parity, but the usable capacity is still better than RAID 10. And degraded mode, while data is being rebuilt on the fly, is not nearly the hit it used to be, thanks to the additional parity stripe.
Steve Modica
January 8, 2011 at 6:57 pm
[Eric Cox] “Don't do RAID 5 on multi-terabyte volumes. Even if the array has hot spares available, this is asking for trouble. The problem is that the chance of an unrecoverable read error while rebuilding from parity is now so high that rebuilds can fail. The read error rate has stayed roughly constant, but disk sizes have scaled up dramatically since the days of RAID 5's inception.”
The choice depends on the drives you use. Typical SATA drives (even ES drives) have an unrecoverable read error rate of 1 in 10^14 bits. SAS drives are lower density and offer 1 in 10^16. The SATA drives we use now are rated at 1 in 10^15, which works out to about one read error per 120TB read. I draw the line at around 10 drives before going to RAID 6.
Steve Modica
CTO, Small Tree Communications
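A rough back-of-the-envelope sketch (not from the thread; it assumes the per-bit error rates quoted above and independent errors, which is optimistic but fine for a gut check) of why those numbers matter for a RAID 5 rebuild:

```python
import math

def rebuild_failure_probability(drive_tb: float, data_drives: int, uer_exponent: int) -> float:
    """Probability of hitting at least one unrecoverable read error (URE)
    while reading every surviving drive once (Poisson approximation)."""
    bits_read = drive_tb * 1e12 * 8 * data_drives      # decimal TB -> bits
    expected_errors = bits_read * 10 ** -uer_exponent  # e.g. 1e-14 for consumer SATA
    return 1 - math.exp(-expected_errors)

# Hypothetical 8-bay RAID 5 of 2TB drives: a rebuild must read the 7 surviving drives.
for exp in (14, 15, 16):
    p = rebuild_failure_probability(drive_tb=2, data_drives=7, uer_exponent=exp)
    print(f"UER 1 in 10^{exp}: ~{p:.0%} chance of a read error during the rebuild")
```

With a 1-in-10^14 drive the rebuild is more likely than not to hit an error, while 10^15 and 10^16 drives bring that down to roughly 10% and 1% in this scenario.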
Steve Modica
January 8, 2011 at 7:38 pm
[Bruce Little] “How about RAID 50?”
Any of the striped (“0”) setups are problematic.
With two RAID 5s, if either has a problem, you are performance limited, so you double your chances for problems (or at least for performance-limited periods). Secondly, the striping drivers that are out there all suck. They give each stripe 256k or 512k, and that's very little relative to what the RAIDs really want; they do best with larger IOs. In Small Tree's testing, any striping makes latency much worse.
Steve
Steve Modica
CTO, Small Tree Communications
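A tiny sketch of the “double your chances” arithmetic (my own illustration with a made-up probability, not Small Tree data): a RAID 50 volume is degraded whenever either of its RAID 5 groups is.

```python
# If each RAID 5 group has probability p of being degraded (mid-rebuild) at any
# given moment, a RAID 50 built from two such groups is degraded whenever either
# group is. Assuming the groups fail independently, that chance roughly doubles.

def degraded_probability(p_single: float, groups: int = 2) -> float:
    return 1 - (1 - p_single) ** groups

p = 0.01  # hypothetical 1% chance a single RAID 5 group is mid-rebuild
print(degraded_probability(p))   # ~0.0199, i.e. almost exactly 2x the single-group risk
```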
Eric Cox
January 8, 2011 at 8:53 pm
Latency is going to suffer with larger stripe sizes. That is simply because the head has to read the full stripe before moving on to the next batch of sectors. Large stripes are typically used where you need increased throughput on large linear reads.
If you need lower latency, which typically benefits smaller non-sequential reads, you would opt for a smaller stripe size.
Ideally, if you know you will be doing large sequential reads with a large stripe size, it helps to align the file system block size. With read-ahead and a large cache it's not such a big deal, but you can eke out some nice performance if you plan accordingly and know the data content.
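As a rough illustration of the alignment point (my own example, with a hypothetical chunk size and disk count, not anything from the thread): an IO that starts on a full-stripe boundary touches fewer stripes than one that straddles a boundary.

```python
def stripes_touched(io_offset: int, io_size: int, full_stripe: int) -> int:
    """Number of full stripes a single IO crosses, given its byte offset and size."""
    first = io_offset // full_stripe
    last = (io_offset + io_size - 1) // full_stripe
    return last - first + 1

chunk = 512 * 1024                  # hypothetical 512k per-disk chunk
data_disks = 7                      # e.g. an 8-drive RAID 5
full_stripe = chunk * data_disks    # 3.5MB full-stripe width
io = 4 * 1024 * 1024                # a 4MB read

print(stripes_touched(0, io, full_stripe))                         # aligned start: 2 stripes
print(stripes_touched(full_stripe - 64 * 1024, io, full_stripe))   # starts just before a boundary: 3 stripes
```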
Steve Modica
January 8, 2011 at 9:59 pm
[Eric Cox] “Latency is going to suffer with larger stripe sizes. That is simply because the head has to read the full stripe before moving on to the next batch of sectors. Large stripes are typically used where you need increased throughput on large linear reads.”
I'm going to respectfully disagree in our context of video editing. The applications all issue 4MB reads; the days of reading a frame-aligned IO are gone. The app is just sucking in chunks as fast as it can. Further, the OS/filesystem code is aggregating those IOs. For example, if you issue a number of 1MB linear IOs, the filesystem code will aggregate those up to 32MB (in 10.6).
What this means in practice is that you want a full stripe size that matches that. Otherwise, you increase the number of stripes you need to read to get the data off (latency ends up going up with the overhead, not down).
The notion of a stripe at 128k or 512k is just stupid at these IO sizes, which is why striping stinks so bad. I believe most of the modern RAID controllers will also service small IOs with a partial stripe read. (This is a guess, but I know they don't all check parity on every IO either.)
To summarize, with the large, aligned IOs the apps are issuing, bigger is usually better.
Steve Modica
CTO, Small Tree Communications
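To put rough numbers on that (my own arithmetic with hypothetical chunk sizes, not Small Tree test results): the smaller the per-disk chunk, the more full stripes a single 4MB application read has to span.

```python
import math

def full_stripes_per_io(io_bytes: int, chunk_bytes: int, data_disks: int) -> int:
    """Full stripes an aligned IO spans; full stripe = per-disk chunk * data disks."""
    full_stripe = chunk_bytes * data_disks
    return math.ceil(io_bytes / full_stripe)

io = 4 * 1024 * 1024          # typical 4MB read from an editing app
data_disks = 8                # hypothetical data-disk count
for chunk_kb in (128, 256, 512, 1024):
    n = full_stripes_per_io(io, chunk_kb * 1024, data_disks)
    print(f"{chunk_kb:>4}k chunks -> {n} full stripe(s) per 4MB read")
```

With 128k chunks the read is split across four full stripes; at 512k chunks and up, one full-stripe read covers the whole IO.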
Bob Zelin
January 9, 2011 at 4:27 pm
Eric Cox writes –
“Don't do RAID 5 on multi-terabyte volumes.”
Eric,
Where do you get this nonsense from? RAID 6 makes sense only for drive arrays with 16 or more drives. A typical small user will buy an 8-bay array, because they want to save money. An 8TB array becomes about 6.5TB after RAID 5, and performance drops compared to RAID 0. Now make that RAID 6, and you get even LESS storage and less performance (throughput). Yes, you have your protection, but at what cost to the user?
Most manufacturers preconfigure their arrays (even 16-bay arrays) to RAID 5 from the factory (think Sonnet, Active Storage, Cal Digit, JMR, Maxx Digital, Small Tree, etc.). RAID 6 is a great idea if you have no one there to ever administer the system, but again, you lose performance and total available size of the array.
So, what is the best “modern” configuration (which will be outdated in 3 months)? The configuration that JMR uses: a split-bus 16-bay chassis, split into two 8-drive bus groups, each with its own SAS host adapter card SET UP FOR RAID 5, striped into a RAID 50 config. This is how you get 1200MB/sec – enough for your uncompressed 4K work. But once 6Gb drives with 6Gb host adapters and 6Gb backplanes become commonplace (sometime in 2011), then EVERYONE will be getting close to 2000MB/sec, and all of this will be moot.
Everyone is a big shot on these forums, but the reality is that most people (most clients) have NO MONEY and are trying to do things as cheaply as possible (read: the most storage for the least amount of money, with the best performance) – and NO ONE that I know would accept an 8-bay chassis configured in RAID 6, no matter how “safe” it was.
And even with RAID 6 – if you don't occasionally check your server to see if you have a failed drive (don't even start with me about how you should have the host adapter email your SMTP mail server with the failure notice – what planet do you live on?), then even with a hot spare ready to go (in your 8-bay), you will STILL lose all your data. If you are lazy and never bother to look, THREE drives will fail, and you are still screwed.
I just had a client lose all his data on an 8-bay. He saw the RED light flashing on the array, and the ATTO SAS event had occurred – DRIVE 2 HAS FAILED – but did NOTHING ABOUT IT (what is your explanation for that, Eric?), and then had ANOTHER DRIVE FAIL (it was RAID 5), so they lost all their data and only called after the second drive failed. Didn't they notice the red flashing light on the array, indicating a failure? Don't you think they might have called me to ask, “gee, what's that red light mean?” (Yes, they were trained on all of this 2 years ago.)
Don’t live in fantasy land. If you do this for a living, you know exactly how clients behave, and what they expect. And professionalism is not one of them.
Bob Zelin
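For reference, a quick sketch of the capacity trade-off Bob is describing (my own arithmetic with hypothetical 1TB drives; raw usable space before any filesystem/formatting overhead):

```python
# Raw usable capacity for the layouts discussed, assuming identical drives.
# RAID 50 here means two RAID 5 groups of drives/2 each, striped together.

def usable_tb(drives: int, drive_tb: float, layout: str) -> float:
    if layout == "raid5":
        return (drives - 1) * drive_tb        # one drive's worth of parity
    if layout == "raid6":
        return (drives - 2) * drive_tb        # two drives' worth of parity
    if layout == "raid50":
        return (drives - 2) * drive_tb        # one parity drive per RAID 5 group
    if layout == "raid10":
        return drives / 2 * drive_tb          # mirrored pairs
    raise ValueError(layout)

for bays in (8, 16):
    for layout in ("raid5", "raid6", "raid50", "raid10"):
        print(f"{bays}-bay, 1TB drives, {layout}: {usable_tb(bays, 1.0, layout):.0f}TB usable")
```

On an 8-bay box the jump from RAID 5 to RAID 6 costs a full drive of usable space (7TB vs 6TB raw), which is the trade-off being argued over.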
Eric Cox
January 9, 2011 at 8:24 pm
I deal with large farms many times larger than this, with systems that have to perform and where data loss means more money than the rack costs. I do, however, have a few friends whose production shops are very much on a budget. While I moved on from that engineering work a while ago, I still work gigs for them in my spare time. Sadly, I know the pain you are talking about.
The problem with being on a budget isn't the budget itself; you can come up with scenarios all day to help someone. The problem is when being “cheap” causes data loss that the customer then has to pay to recover from (if that's even possible).
Funny how you mention someone ignoring the drive light. I had a group who ignored the drive-failure light on the host. However, I had enabled the audible alarm on the controller, and of course it went off blaring. They managed to actually fire up the software and disable the alarm, completely ignoring the disk failure itself! The system actually emailed out the failure as well, but the technician they previously had was no longer working there. To add even more insult to injury, the disk that was in the hot-spare bay had been removed and presumably re-purposed to someone's home.
Surprisingly, I did recover the array and start the rebuild, but it was more luck than anything, as the second failed disk came back online.
As someone who likes to live in fantasy land, I will always preach data protection. If someone wants to go the cheap route, then the loss is ultimately on their head. However, I don't believe I can truly express how fragile drives tend to be. It's not some evangelical crusade, either. I have been there when the data loss and recovery were on my head. It's not a good feeling, and it's probably worse for the customer.
Bob Zelin
January 9, 2011 at 11:44 pm
Eric writes –
“Funny how you mention someone ignoring the drive light. I had a group who ignored the drive-failure light on the host. However, I had enabled the audible alarm on the controller, and of course it went off blaring. They managed to actually fire up the software and disable the alarm, completely ignoring the disk failure itself! The system actually emailed out the failure as well, but the technician they previously had was no longer working there. To add even more insult to injury, the disk that was in the hot-spare bay had been removed and presumably re-purposed to someone's home.”
REPLY – Now you know why I made a hysterical reply to you. What you have just described is my typical client. Many “pros” in high-end installations simply cannot understand (or believe) what you have just described. I have trouble conveying info like this to ATTO and Areca.
Bob Zelin