Hitachi SMART status temps are way wrong
Posted by John Davidson on February 1, 2012 at 2:33 am
Hi all,
Testing out a new system with an ATTO R680 / ProAvio 8MS and 3TB Hitachi Deskstar drives. ATTO Config’s SMART status is saying the drives are 170-211 degrees, which ATTO says might be why we’re getting erratic RAID System Test speeds. Using the finger method, these drives are not that hot at all. It’s reporting these temps even when the drives are cold and freshly turned on.
Does anyone have 3TB Hitachi drives installed? If so, can you tell me what SMART status says their temperature is?
ATTO says the temperature might be fine and that Hitachi might be formatting the SMART temperature in a way where the drives are 100 degrees less than SMART says, but I can’t find anything online confirming that. If that is the case, I may have a faulty drive, because something is causing erratic speeds out of our R680. We have an R380 w/ 2TB drives in another room, and the AJA system test graph shows beautiful even lines for read and write, but the R680 graph looks like a seismologist’s worst nightmare.
My RAID is configured using settings given to me by ATTO. Initially, we got one crap drive that kept faulting the initialization and we had to RMA it. Perhaps there’s a second? Sometimes the RAID just unmounts in the middle of a system test, which is not good.
Thanks in advance!
(* Hitachi claims these drives are excellent for arrays, the 3TB Deskstars are recommended on ProAvio’s website for use in the 8MS enclosure, and they were personally recommended to me by the awesome ProAvio tech support.)
-
Petros Kolyvas
February 1, 2012 at 5:04 am
ATTO is correct regarding the temperature possibilities:
100 − temperature is a possible SMART temp readout format. (This would, however, put your drives below zero.) Having said that, we get similar readings from WD RE drives on our R680. I wouldn’t pay much attention to SMART … it’s often useless; we’ve seen disks fail that were reporting nothing but “in the clear” with SMART. Enclosure temperatures are a much better signpost in my very humble (and not so experienced) opinion.
From the SMART entry on Wikipedia (https://en.wikipedia.org/wiki/S.M.A.R.T.#Information_Provided):
ID 190 (0xBE), Temperature Difference from 100: Value is equal to (100 − temp. °C), allowing manufacturer to set a minimum threshold which corresponds to a maximum temperature.
I can’t speak to the inconsistencies in performance, however. I hope you find the cause!
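As a rough illustration of the attribute 190 arithmetic quoted above, here is a minimal Python sketch. The function name is made up for this example, and it assumes the displayed figure really is an attribute 190 value in the 100 − temp format, which nothing in this thread confirms.

```python
def temp_from_difference_from_100(value: int) -> int:
    """Invert the 'Temperature Difference from 100' encoding.

    The quoted definition is value = 100 - temp (deg C),
    so the temperature is recovered as temp = 100 - value.
    """
    return 100 - value


# The two ends of the 170-211 "degrees" range from the original post.
for reported in (170, 211):
    print(reported, "->", temp_from_difference_from_100(reported), "deg C")
```

Both ends of the reported range come out far below zero, so that format alone does not explain the readings.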
—
There is no intuitive interface, not even the nipple. It’s all learned. – Bruce Ediger
-
John Davidson
February 1, 2012 at 10:31 pm
Interesting, that appears to be for code 190. Code 194 is the one giving us warnings, and it doesn’t seem to have a 100-temp label:
0019 10:48:13 WARN Disk [MK0311YHGK8M4A] SMART attribute 194 worst is now 127
0020 11:48:13 WARN Disk [MK0311YHGLNS9A] SMART attribute 194 worst is now 125
0021 11:48:13 WARN Disk [MK0301YHGM9WKD] SMART attribute 194 worst is now 122
0022 11:48:13 WARN Disk [MK0311YHGLW9ZA] SMART attribute 194 worst is now 162
0023 11:48:14 WARN Disk [MK0301YHGLZ0PA] SMART attribute 194 worst is now 150
0024 12:48:13 WARN Disk [MK0311YHGK8M4A] SMART attribute 194 worst is now 125
0025 12:48:13 WARN Disk [MK0301YHGLBLPA] SMART attribute 194 worst is now 136
I have two additional drives coming in to replace one that was bad and another that seems to have a higher temperature than the others. We’ll see what happens….
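For anyone sifting through a longer log, a small Python sketch along these lines could pull out the most recent attribute 194 "worst" value per disk. It assumes each warning sits on a single line in the format of the log above (entry number, time, level, serial in brackets, value). The regular expression and the function name are illustrative and not part of any ATTO tool.

```python
import re

# Matches lines such as:
# 0019 10:48:13 WARN Disk [MK0311YHGK8M4A] SMART attribute 194 worst is now 127
LINE_RE = re.compile(
    r"^(?P<entry>\d+)\s+(?P<time>\d{2}:\d{2}:\d{2})\s+WARN\s+Disk\s+"
    r"\[(?P<serial>[^\]]+)\]\s+SMART attribute\s+(?P<attr>\d+)\s+"
    r"worst is now\s+(?P<value>\d+)$"
)


def latest_worst_by_disk(lines, attribute=194):
    """Return {serial: last reported 'worst' value} for one SMART attribute."""
    latest = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and int(match.group("attr")) == attribute:
            # Later log entries overwrite earlier ones, so the last one wins.
            latest[match.group("serial")] = int(match.group("value"))
    return latest


# The warnings posted above, one per line.
LOG = """\
0019 10:48:13 WARN Disk [MK0311YHGK8M4A] SMART attribute 194 worst is now 127
0020 11:48:13 WARN Disk [MK0311YHGLNS9A] SMART attribute 194 worst is now 125
0021 11:48:13 WARN Disk [MK0301YHGM9WKD] SMART attribute 194 worst is now 122
0022 11:48:13 WARN Disk [MK0311YHGLW9ZA] SMART attribute 194 worst is now 162
0023 11:48:14 WARN Disk [MK0301YHGLZ0PA] SMART attribute 194 worst is now 150
0024 12:48:13 WARN Disk [MK0311YHGK8M4A] SMART attribute 194 worst is now 125
0025 12:48:13 WARN Disk [MK0301YHGLBLPA] SMART attribute 194 worst is now 136
"""

for serial, value in latest_worst_by_disk(LOG.splitlines()).items():
    print(serial, value)
```

Keeping only the last value per serial makes it easier to see which disk is the outlier without reading every hourly entry.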
Thanks again Petros!
John Davidson | President / Creative Director | Magic Feather Inc.
-
Bob Zelin
February 2, 2012 at 1:39 am
Hi John –
I just had a similar thing happen with the R680. I called up ATTO and said, “I installed a second card, and the same thing is happening – what the hell is wrong with your card?”
It turned out that nothing was wrong with either R680 card. I had never seen this before, but the fans in the Mac Pro had stopped working, and within 15 minutes, the temperature warnings came up.
Open the side of your Mac Pro, and make sure ALL your fans are running, while you have power on.
Bob Zelin
-
John Davidson
February 2, 2012 at 2:21 am
My understanding is that the ATTO cards are supposed to crank up the internal fans to full blast to compensate for the lack of an onboard fan. My issue is related to drives in a RAID 5 ProAvio enclosure. As we had a second room with WD drives and a second identical 8MS enclosure, this is what we’ve done to isolate the issue:
1. Swapped enclosures. It’s not a faulty enclosure issue; the drives still claim to be hot.
2. Plugged the RAID into an R380 card. Drives still look hot (ATTO Config SMART says 211 degrees on some drives, even when cold).
3. Swapped 2TB WD Caviar Black drives into the R680, all chilling at a nice 111 degrees. It’s not an ATTO issue.
At best I can isolate the issue to being specific to how Hitachi drives report SMART status temperature, as in real life they are not hot. With this in mind, I’m just disabling SMART status monitoring, as it seems to be good only at giving incorrect information.
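The thread never says which unit ATTO Config uses for these numbers. As a quick sanity check only, the sketch below assumes the display is in degrees Fahrenheit (an assumption, not something ATTO or Hitachi confirmed here) and converts the two figures mentioned above.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Standard conversion: C = (F - 32) * 5 / 9."""
    return (deg_f - 32) * 5 / 9


# ASSUMPTION: readings are in Fahrenheit; the thread does not state the unit.
# 111 is the WD Caviar Black reading, 211 the highest Hitachi reading.
for reading in (111, 211):
    print(f"{reading} F = {fahrenheit_to_celsius(reading):.1f} C")
```

Under that assumption the WD figure would be about 44 °C, a plausible running temperature, while 211 would be about 99 °C, which does not fit drives that are cold to the touch.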
We just got two new drives in. I’ll be removing the drive claiming to be the hottest and rebuilding the array tonight. With a little luck, we’ll be back to nice even read / write lines and can move on to deal with other looming disasters.
John Davidson | President / Creative Director | Magic Feather Inc.
-
Jon Schilling
February 2, 2012 at 9:40 pm
We did some testing and are seeing the same results: an indication of “High Heat”, while in actuality the drives were well within the correct operating temperature parameters with both the R380 and R680 cards.
Latest as of 2/2/12 at 2:04 PM PST:
I just got wind of some new drivers. I believe John Davidson will post an update to his original post.
Jonathan Schilling
Vertical Sales Manager
ProAvio
Main: 562-777-3488 X106
Fax: 562-777-3499
Email: jon@proavio.com
-
John Davidson
February 3, 2012 at 12:28 am
So, new drivers came out, I installed them, my RAID faulted, then rebuilding crashed, and now I’m just reinitializing the entire RAID and hoping that it doesn’t report ‘faulted’ during initialization and require a system restart over and over again as it did last night.
I installed the 64-bit flash firmware, and then noticed that my Lion kernel was at 32-bit, so I changed that and am now officially running the 64-bit kernel. I was not aware that Lion still ran a 32-bit kernel.
ATTO said some people had problems with the firmware released in December. The new firmware is dated 1/24/12.
Hitachi has no idea how their drives report smart temperature. None. I’m not actually surprised about this. Big corporations are awesome.
Luis and Jon at ProAvio are awesome, as always. Thank you for helping me!!! Steven from ATTO has also been really good about directing me to potential issues. Hopefully their new firmware fixes all.
As a last resort I still have the old Highpoint RAID card lying around. If I get more faults on the current initialization (right now at 12% with no faults yet) I’m going to rebuild the RAID using that card to make sure it’s not a bad drive (I’ve replaced three already, but I don’t even know if they were really bad). If it weren’t for the fact that ProAvio gets such excellent performance from the combo (read/write is 1000 and the graph line is beautiful) I’d have run for the hills.
At this point it’s entirely possible that I have a wonky R680. While hard drives are definitely in short supply, I just can’t imagine that I’d get so many bad drives in a single delivery. After the next initialization and running the RAID Maintenance App from ATTO, if there are STILL issues, I’ll take out each drive and put them inside a Mac Pro. At least that way I’ll be able to officially rule out the drives as the culprit.
Here’s a grab of the AJA speed test results I was getting. If I used the 16GB test, sometimes it would just knock the RAID offline. Lame.

This is what Luis at ProAvio is getting with the same setup. Look at that gorgeous line!

Forgive the length of this, I’m merely trying to document as much as possible for other readers who encounter issues in the future. I’ll continue documenting further developments.
-
John Davidson
February 4, 2012 at 4:38 am
Last night at 2am the initialization completed with only a single error requiring a restart to continue. I then ran the RAID maintenance utility from ATTO to confirm it was healthy. The utility said the RAID was healthy. Then I ran AJA System Test. On the 2, 4, 6 and 8 GB tests the results were somewhat erratic, but halfway through every single 16 GB test, the RAID would unmount, I’d get a warning about an improperly ejected disk, and then a minute or two later it would remount.
So I went into the office at 2am to swap out the RAID card with the older Highpoint 4322 that had been in the system for a year until last week’s “upgrade”.
Initializing didn’t work on the Highpoint for some odd reason, but all drives were seen and reported excellent temperatures and healthy SMART status. So today I took out each drive, dropped them into a Mac Pro one at a time, and formatted/tested individually to see if there was a problem. Each drive has its own ‘personality’ on AJA System Test, but all were essentially normal. I have screen grabs of each.
I put everything back in, went back into the Highpoint manager to create a new RAID, skipped initialization and used old data, and instantly the drive was there and ready to rock. Speed tests were slower than the R680 when it works, but were more consistent than I’ve actually ever gotten from this Highpoint card. For whatever reason, the Highpoint wouldn’t complete the initialization, but at least I knew the drives were fine. I’m not interested in using the Highpoint any more anyways.
I also switched the cables with another room and added a new one, just to rule out the cables.
The resolution to this snipe hunt is that I grabbed the ProAvio box w/ drives, plugged it into the other edit suite with an R380, and lo and behold all my drives look good and it easily and swiftly began the initialization process via Express Initialization, which lets you mount a drive and use it at a slower rate while initializing. The RAID works great now. No dropping. No faults.
Obviously, I got a bad R680. It’s a bit annoying that “do I have a bad card” was the first thing I asked ATTO before they told me my problem was possibly related to the heat warnings. Ironically, their software isn’t capable of telling the actual temperature of a drive (even Highpoint seems to have that down). These 3TB Hitachis have been out a while, are recommended for RAID by Hitachi, and ATTO should know that their system reports wrong temperatures for this very common drive. Further, I got no answer when I asked what could possibly be knocking the RAID offline, presumably because the answer is “You got a bad card”. This could have also been a bit easier if the download link for the R680 driver and maintenance utility had actually worked on the ATTO website last weekend. I understand faulty products get put out all the time by every manufacturer, but if I made RAID cards, I’d test them with virtually every drive in the world. There are only 2 or 3 companies that make hard drives, after all.
The process of building a DIY RAID is always a bit challenging, but I did my homework and still got hammered by it. Areca, here I come.
For what it’s worth, our editing iMac w/ Promise Pegasus RAID 5 came pre-striped and worked right out of the gate, no configuration required. I hope ProAvio has a Thunderbolt enclosure coming down the pipe soon. Their tech support is fantastic.
-
John Davidson
February 7, 2012 at 10:06 pm
Yesterday we put in the Areca 1882x, which is a little overkill, but we’re getting a pretty awesome solid line on our AJA System Test graph at read/write of about 930 MB/s each. I’ll post a grab to show it off tomorrow when all our media is done copying back to it, because it’s awesome.
I really have to thank Jon and Luis at ProAvio again for their help. Their responsibility was only with the enclosure, which was never the problem, but they were still incredibly helpful in getting me up to speed and even offered to configure the system at their offices! Be sure to hit them up at NAB!
The bright side of all this is that I’ve learned Areca, Highpoint and ATTO setup procedures for RAID cards.
John Davidson | President / Creative Director | Magic Feather Inc.
-
John Davidson
February 9, 2012 at 8:18 pm
ATTO is not providing me with an RMA. The reseller doesn’t want to take it back without one, but as this was an Amazon purchase, it looks like I’m going to have to eat a 20% restocking fee.
I think the worst thing about this is that I was a big ATTO fan. https://forums.creativecow.net/thread/71/861027
So let me show you the performance of the Areca 1882x card, which installed, initialized and worked right out of the gate with no problem.
This is a little off of the 900 MB/s I was getting on an empty RAID. The RAID now has about 10 TB of data on it.
-
Petros Kolyvas
February 9, 2012 at 8:31 pm
That’s sad to hear. Apparently Areca has very highly regarded support as well!
—
There is no intuitive interface, not even the nipple. It’s all learned. – Bruce Ediger
