Chris Murphy
Forum Replies Created
-
Chris Murphy
August 19, 2013 at 3:35 am in reply to: External hard drive(s) causing Kernel Panic since RAID setup.
The journal makes it possible to mount an unclean file system faster than doing a full fsck; it is not a data recovery mechanism. On large volumes, or with many files, an fsck on a non-journaled file system can take a long time, even hours, and gobs of memory. The journal has no possible relationship with application problems; applications aren’t even aware of whether journaling is enabled. There is perhaps some advantage to disabling journaling on SSDs, which can be repaired much faster after an abrupt unclean unmount, and the journal’s constant metadata writes also cause additional SSD wear. But if this is for video, there isn’t much metadata anyway: the files are huge and few, therefore not much metadata.
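If you do decide to experiment with journaling off on an HFS+ scratch volume, it can be toggled from the command line (the volume name here is just a placeholder):
diskutil disableJournal /Volumes/VideoScratch
diskutil enableJournal /Volumes/VideoScratch     # to turn it back on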
-
Yeah, about the green drives and raid thing. I don’t know that anyone said you can’t do it. Just that use in raid isn’t recommended, including by WDC. In fact WDC says these drives are for secondary usage, implying they don’t recommend them for boot drives either. I don’t see the point in arguing with a manufacturer who is basically saying in a marketing data sheet “we really don’t want your money for your intended use case.”
Further, it’s just a matter of time before there will be problems with these drives. Forums everywhere are full of stories of raid5’s collapsing when green drives are used. The common sequence: one drive dies, or takes too long in error recovery for the controller, so the controller kicks out the drive or resets the bus; the user replaces the “bad” drive (which may or may not actually be bad); and then all it takes is a single bad sector to cause another kicked drive, another bus reset, or an actual sector read error. In any of those scenarios the raid5 rebuild halts, and the array is no longer merely degraded, it has collapsed. So then people freak out because only one drive died and this isn’t supposed to happen, blah blah blah.
The real problem with the drive is that its error recovery control (ERC) timeout is too long, and it can’t be configured on any of the recent Greens. If you want to play with fire, set the controller error timeout to at least 121 seconds so the drive has enough time to actually report a read error, letting the raid controller repair the bad sector. And also do regular scrubs. (A quick way to check whether a given drive even supports configurable ERC is shown below.)
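For what it’s worth, on drives that do support it, SCT ERC can be queried and set with smartmontools; this is a Linux-style sketch, the device name is a placeholder, and values are in tenths of a second:
smartctl -l scterc /dev/sdb          # report the current SCT ERC read/write timeouts
smartctl -l scterc,70,70 /dev/sdb    # set both to 7 seconds
The recent Greens will simply report that SCT Error Recovery Control isn’t supported, which is exactly the problem.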
Of course, in the meantime, your application must be able to gracefully contend with up to 2 minute hangs while the drive sorts out whether the data on that sector can be read or recovered. Many applications (to say nothing of the user) get pretty pissy when there’s an IO delay of 30 seconds, let alone 2 minutes.
And it’s not like it’s a whole lot better on the Seagate consumer side, where the marketing spec sheet now lists, under Reliability, a 2400 hour power-on spec. That’s 100 days at 24×7. A Google or Amazon, if they were even to use such a drive, would bust through that spec on day 101, and exceed it by a factor of 7 before the warranty was up.
There’s no good reason for these companies to honor warranties at all when drives are used in situations that are plainly proscribed.
-
The symptoms you describe could be network or drive related. While you have good network hardware, the cabling quality is unknown, and cabling is a top source of network problems even at much lower speeds than 10GigE, which is even more finicky about the IEEE rules on cable lengths, bend radii, proximity to ballasts and other electrical equipment, pinching, and pressure (think stapling cables to a wall, or a table leg squishing one; that’s BAD). You should be able to isolate this by putting a single USB drive on the server and doing some basic (large) file copies with curl or rsync, while checking the performance in iotop or equivalent, to see whether you’re getting source read stalls from the array or whether it only happens over the network. If it’s sourced at the array then you’ll need to figure out why that’s happening, and it wouldn’t surprise me if it’s bad sectors on new green drives.
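A rough way to split the two, assuming you can get a shell on the server (paths and hostnames are placeholders):
rsync --progress /mnt/array/large_test_file.mov /mnt/usbtest/        # local copy, array to USB drive, no network involved
rsync --progress /mnt/array/large_test_file.mov user@client:/tmp/    # the same file copied over the network
iotop -o    # run on the server during each copy; look for long read stalls coming from the array
If the purely local copy from the array stalls the same way, the network is off the hook.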
WDC explicitly proscribes the use of Green, Blue, and Black drives in anything other than raid1 or raid0. The Reds are proscribed in arrays of more than 4 disks. The REs are recommended for 5+ drive arrays.
-
Chris Murphy
August 18, 2013 at 8:12 pm in reply to: External hard drive(s) causing Kernel Panic since RAID setup.
Even if true, the read-write head is designed such that writes are immediately followed by reads, and the firmware confirms or denies the write. If writes persistently fail for a sector, its LBA is reassigned to a reserve sector and the bad sector is removed from use. It can’t even be accessed by software anymore, as it no longer has an LBA. Writing zeros will also do this. But it’s not really necessary in advance.
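You can see whether any of this reallocation has actually happened on a given drive with smartmontools (the device name is a placeholder):
smartctl -A /dev/sdb | egrep -i 'reallocated|pending|uncorrect'
Reallocated_Sector_Ct is sectors already remapped to the reserve area; Current_Pending_Sector is sectors the drive suspects are bad but hasn’t yet been able to rewrite or remap.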
The bigger issue for large raid arrays is the lack of scheduled scrubbing, to make sure rarely read or written sectors aren’t going bad. The typical raid5 array failure goes like this: scrubbing isn’t done, a handful of bad sectors develop on one or more drives, and then a single drive fails. During rebuild, one of those bad sectors is encountered as a persistent read failure, the rebuild fails, and thus the array fails. It’s also exacerbated by using consumer drives with very long error correction timeouts.
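Apple’s software RAID has no built-in scrub that I’m aware of, but to show what I mean, on Linux md RAID a scrub is just this (the md device name is a placeholder):
echo check > /sys/block/md0/md/sync_action   # read every sector of every member; unreadable sectors get rewritten from redundancy
cat /proc/mdstat                             # shows the check progress
Most Linux distributions ship a job that does this monthly; the point is that it happens on a schedule, not only during a rebuild.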
Before disposing of a drive, it’s probably a good idea to “clear” or “purge” any non-raid or raid1 drives, just like it’s a good idea for photographers to tear their reject prints before disposing of them. “Clear” is the minimum kind of erase, which is what Apple Disk Utility uses. To constitute “purge” one must use the firmware’s ATA Secure Erase, Sanitize, or Crypto Erase commands. Even though ATA Secure Erase has been in ATA drives since 2000, Apple sadly doesn’t support it in Disk Utility, and it’s the only way to sanitize data that still exists in bad blocks removed from use. The added thoroughness is perhaps a small benefit, but using ATA commands to have the firmware perform the operation is much faster than writing zeros sector by sector in software, and it entirely frees the bus from write operations. Further annoying is that most USB bridge chipsets don’t support the passthrough needed for these commands to work, so typically they’re only available with a SATA or eSATA connection.
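For the curious, on Linux hdparm can issue ATA Secure Erase over a direct SATA connection; a rough sketch, with the device name and password as placeholders (the drive must not be in a frozen state, and this destroys everything on it):
hdparm -I /dev/sdb | grep -i frozen                       # the Security section must report "not frozen"
hdparm --user-master u --security-set-pass pass /dev/sdb  # set a temporary security password
hdparm --user-master u --security-erase pass /dev/sdb     # firmware erases the whole drive, including reallocated sectors
As noted above, this generally won’t work through a USB bridge.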
-
Chris Murphy
August 18, 2013 at 7:32 pm in reply to: External hard drive(s) causing Kernel Panic since RAID setup.
Zeroing the drives can’t possibly be related to the problem. It may be pointless, but it’s also benign. He had roughly 750 million successful write operations doing that zeroing, and only once the raid1 was created did he start getting kp’s. So I think the problem is elsewhere.
If he can get one drive into either the existing docking station or an alternate enclosure without getting a kp, or boot a totally different kernel that doesn’t panic with this hardware, it’s possible to remove the Apple raid1 metadata from both drives and then see if the kp is consistently reproducible only when the drives are software raid1’d together. That’s much faster than returning the drives. And to save time he needs to know whether this is induced by software raid1. I think it’s some obscure incompatibility between software raid1 and this docking station, but there isn’t enough information to know that yet.
-
Chris Murphy
August 18, 2013 at 7:25 pm in reply to: External hard drive(s) causing Kernel Panic since RAID setup.
So the kernel panic only happened after creating a software raid1 array? And the drive erase operation that took three days was successful for both drives?
Certainly any kp is a kernel bug, but it can be triggered by misbehaving hardware. Yet ~750 million write operations were successful during the erase. Can you post the kernel panic to pastebin and post the URL in the forum?
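If you’re not sure where to find it, OS X normally saves panic reports as plain text files you can copy straight into pastebin; the exact path can vary by OS X version, but something like this should turn them up:
ls /Library/Logs/DiagnosticReports/Kernel_*.panic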
It’s a good idea to allow the system to report the kernel panic to Apple, and in the comment section include the make/model and firmware revision of the docking station, and note that the kp’s started to happen only after two drives were software raid1’d together. That could be coincidence; the only way to test it would be to work with drives that don’t have Apple raid metadata on them and see if they can be used independently for a while, then recreate the software raid1 and see if you get kp’s again.
Do you get a kp when only one of the drives is inserted into the docking station? If not, the Apple raid metadata can be removed from OS X using dd. If it does still kp, then either put the drive into an alternate enclosure that hopefully doesn’t trigger the kp, or boot the iMac from a Linux live cd/dvd and use dd there to wipe the raid metadata off the drives. So decide which is easier: downloading and booting from a Linux live cd/dvd, or swapping enclosures. And then once you confirm you’re not getting kp’s when connecting a single drive, post the result from this command:
diskutil list
And I’ll get you a suitable dd command to blow away the Apple software raid metadata.
-
It’s an annoying debate style to throw rocks, claim a miss is a hit, and then duck for cover so you can’t be asked questions any more.
I didn’t add more assumptions. I did support the statement with a more detailed explanation. I also refuted your assumptions about equal RAID 5 vs RAID 0 behavior in the face of a URE, a significant factor supporting my statement, as well as refuting the words you put in my mouth that I assumed the array would stay degraded. (My argument is weakened if it’s allowed to stay degraded; the likelihood of collapse increases as the array is rebuilt.)
You’ve essentially asked for a study that supports 2+2=4, or that “the more drives added to a RAID 0 array, the higher the probability the array will collapse.” For one, such a study would be expensive and take a long time; real drives of identical model from the same batch are not in fact identical enough for repeatable experimentation. You’d end up with noisy data, inaccurate results, and misleading conclusions. That’s why there’s statistical analysis.
The flawed assumption here is embedded in the most common comparative description of RAID 5 and RAID 0: “If a drive fails, RAID 5 survives, while RAID 0 collapses.” It’s a true statement, but it contains a false assumption. The two arrays (of equal usable size, of course) do not have an equal chance of one drive failing.
Perversely, you allow the common and incorrect assumption to pass by without scrutiny, while denying me the exploration of the statistically more likely scenario by way of demanding a study, and opining that only I find exploration of this scenario reasonable. Why do you find it unreasonable?
Alas, Google demonstrates yet again that my assertion isn’t an original one. This is not a study.
There are ways to mitigate the likelihood of a URE with RAID 5 and improve its reliability considerably, even when using consumer SATA drives. But that means having compatible error timeouts between drive, controller, and block device layer in the OS. This is non-obvious, which is simply why it’s better to not use consumer drives in RAID 5 (or 1 or 6 or 10 for that matter).
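As an illustration of the OS side of that, on Linux the block layer gives up on a command after 30 seconds by default, which is far shorter than a consumer drive’s internal error recovery; if the drive’s ERC can’t be shortened, the usual workaround is to raise the kernel timeout instead (the device name is a placeholder):
cat /sys/block/sdb/device/timeout          # default is 30 (seconds)
echo 180 > /sys/block/sdb/device/timeout   # give the drive time to report the read error instead of getting kicked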
-
RAID 5 offers redundancy, RAID 0 does not. My assertion isn’t that RAID 0 is always more reliable than RAID 5, the assertion is that it can be. The assumptions I’ve offered for that case are reasonable ones, and also aren’t the ones you’re complaining about. e.g. I’m assuming high density consumer drives which have orders of magnitude more UREs than enterprise SAS drives, and you haven’t complained about that.
The assumptions you have a problem with:
There is a significant difference between RAID 5 and RAID 0 with URE. A degraded RAID 5 array will collapse, while a RAID 0 array won’t.
I’ve made no statement that enhances the weighting of RAID 5 going degraded, or that it would stay degraded. It’s a fact that a 6 disk RAID 5 has a greater probability of going degraded due to a drive failure than a 5 disk RAID 0 has of collapsing. Naturally, because of that higher probability, I’m exploring the outcome of a degraded RAID 5 vs a normally functioning RAID 0 in the face of a URE. That’s highly relevant to understanding the differences between the two.
If you look at the overall probabilities, RAID 5 isn’t hugely better or more reliable than RAID 0, when used with high density consumer HDDs, and can actually be less reliable. The point being, people who need redundancy should look to RAID 6 or 10 with conventional file sharing. A distributed file system implementing synchronous replication between RAID 0 arrays is also reasonable.
As for RAID 5 with a hot spare being almost as good as RAID 6, I think that’s absurd. Degraded RAID 5 is not URE tolerant. A single disk degraded RAID 6 is. They’re about as similar as a single engine airplane and a twin in the face of an engine failure.
Already six years ago NetApp asserted that “protecting online data only via RAID 5 today verges on professional malpractice.” I’m amused just imagining what they’d call it today.
-
It’s a pretty straightforward probability estimate; I don’t think a study is needed. Most people familiar with RAID 0 already know that the probability of total array collapse increases with the number of disks.
Assume identical model drives for the following:
The probability of a disk failure for a 6 disk array is greater than that of a 5 disk array. The more drives in an array, the greater chance of a disk failure. This is an unremarkable observation.
This holds even if the 6 disk array is a RAID 5 and the 5 disk array is a RAID 0; the per-drive probability of failure is unchanged. The RAID 5 array has a higher probability of encountering a single disk failure than the RAID 0 because of the additional drive. Technically the disks in the RAID 5 work harder than those in the RAID 0, due to RMW, but whether that has a significant statistical impact on reliability would be up to a study to demonstrate. It’s a fair comparison because the usable storage is the same for both arrays.
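To put rough numbers on it, assume (purely for illustration) a 3% chance of any given drive failing within a year:
P(at least one failure, 5 disks) = 1 - 0.97^5 ≈ 14%
P(at least one failure, 6 disks) = 1 - 0.97^6 ≈ 17%
So the RAID 5 is meaningfully more likely to end up degraded in the first place, before we even get to what happens next.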
Now compare a degraded 6 disk RAID 5 and a 5 disk RAID 0. The RAID 5 has a distinct disadvantage now if a bad sector is encountered. A URE on the RAID 0 is a near non-event: maybe a file is corrupted, or the file system needs repair. On the degraded RAID 5, a URE means the rebuild fails and the array collapses. Even if you have specialized knowledge to recover the data, the purpose of RAID 5, uptime, is compromised in this scenario.
This is why so many data storage companies proscribe RAID 5 with consumer drives.
Of course, for the OP, he’s presumably converting from RAID 5 to RAID 0 with the same number of disks, increasing usable capacity, and increasing the probability of array failure. If there are other ways to mitigate this, it may still be an acceptable trade off.
-
With high density consumer drives, even RAID 5 is at least flirting with disaster. Either the drives don’t support configurable error timeouts, or most RAID systems don’t set the drive error timeout to an appropriately lower value, thereby ensuring an accumulation of uncorrected bad sectors and UREs. When one drive dies, any encounter with a URE during rebuild will cause the array to collapse, and this is now quite common.
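Back of the envelope, and purely illustrative, taking the typical consumer spec sheet URE rate of 1 per 10^14 bits at face value: a rebuild that has to read 12 TB of surviving data reads about 9.6 x 10^13 bits, so
P(at least one URE during rebuild) ≈ 1 - (1 - 10^-14)^(9.6 x 10^13) ≈ 1 - e^-0.96 ≈ 62%
Real drives often do better than the spec, but it shows why rebuild failures on big consumer-drive RAID 5 arrays are so common.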
In fact with enough drives, RAID 0 can be more reliable statistically than RAID 5 when the arrays are the same usable size. That one additional disk required for a RAID 5 tips the probability against RAID 5. Really it should only be used with smaller density nearline and enterprise drives, and only with a consistent backup strategy. It’s why RAID 6 and 10 have become so much more popular with big data applications.
Most likely, for the OP, his application involves lots of small to medium file writes, which are penalized by read-modify-write (RMW) on RAID 5/6 but not on RAID 0. Or maybe he’s instituting a daily (or hourly) replication, so that the RAID 0 can be used locally with minimal data loss should the array collapse.
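If that’s the plan, it doesn’t have to be elaborate; a crontab entry along these lines would do it (paths and hostname are placeholders):
# hourly one-way replication of the RAID 0 volume to another machine
0 * * * * rsync -a --delete /Volumes/Raid0Scratch/ backupserver:/Volumes/Replica/
The replica is then at most an hour behind if the RAID 0 collapses.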