-
Direct connect Fiber-PC ?
Posted by Clayton Botkin on November 23, 2010 at 6:21 am
Hi – I’m a very novice SAN/fiber student (with very advanced math modeling efforts).
Problem: Statistical software system (SAS) is used in a very small shop to process and analyze 200 million records. The data is in 200-300 GB files, and this software is I/O constrained. The system is a high-powered multi-core PC with 6-24 GB RAM. CPU cores are “fine”… RAM is “fine”… The problem is that data storage is all via 1 Gig Ethernet connections to 1 TB NAS storage drives.
While this situation does not need a ‘shared’ workspace, it is very similar to dynamic video editing with massive I/O (on a single control PC).
Solution? A single PC (no need to share now) with a more “direct” connection (SAN/fibre), 4G?, with 400 MB/sec throughput to 1-4 TB storage drives.
How? For less than?
How to find these entry-level components?
Need a prescription. Thanks.
10 Replies
-
Bob Zelin
November 23, 2010 at 7:50 pm
You are in a university environment, which has more money than any private company to invest in technology. Please don’t tell me that you need a “budget” solution.
You don’t need shared storage – correct – just a fast data pipe to go from your PC to your storage array.
You have two solutions – a Fibre Channel host adaptor (from a company like ATTO Technology) to a fibre array, or direct connect from your PC to a storage array, using a SAS/SATA host adaptor (from companies like LSI Logic, ATTO Technology, CalDigit, Areca, Highpoint, etc.).
A direct connect SAS/SATA port will give you about 700MB/sec, not the 60MB/sec you are getting right now from your Ethernet-based NAS.
Exactly what drive array do you intend to use? You can take the drives out of your NAS box, and put them into a different chassis with either a SAS interface or a Fibre interface, and dramatically increase your throughput.
Bob Zelin
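To put the two throughput figures quoted above in perspective, here is a rough back-of-the-envelope sketch of how long one sequential pass over a 200 GB file would take at each rate. It assumes the rates are sustained and that 1 GB = 1000 MB; the function name is only illustrative.

```python
def transfer_minutes(size_gb: float, rate_mb_per_s: float) -> float:
    """Minutes needed to stream size_gb gigabytes at a sustained rate.

    Assumes 1 GB = 1000 MB and ignores protocol overhead and seek time.
    """
    return size_gb * 1000 / rate_mb_per_s / 60


# Rates taken from the post above: ~60 MB/s over the NAS, ~700 MB/s direct.
for label, rate in [("GigE NAS", 60), ("direct-attached SAS/SATA", 700)]:
    minutes = transfer_minutes(200, rate)
    print(f"{label}: {minutes:.1f} min per pass over a 200 GB file")
```

Real jobs read and write the data more than once (the stats package makes its own working copy), so the gap per run is likely larger than a single pass suggests.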
-
Chris Gordon
November 23, 2010 at 11:58 pm
A couple of questions first:
– Which NAS are you using now?
– What are the specs on it (number/type of drives, type/number of CPUs, amount of cache, etc)?
– Have you verified that the NAS device itself is not IO or CPU bound, not just the pipe between your machine and the storage array?
– Which NAS protocol are you using (SMB/CIFS, AFP, NFS, iSCSI)?
– What is the nature of the IO? Reads and writes? Sequential or random?
– You say “PC”. Does that mean your machine is running Windows or is it running some other OS?
As you probably know, this is all a game of moving bottlenecks around. That means you need to find out what part of the chain is slow and fix that, then move to the next part until everything performs as you want. There are several places to look:
– Make sure your host is not memory/CPU/bus constrained (you’ve already done this).
– Look at the storage array itself. You need to first check to see if it is constrained somewhere. Remember, these aren’t magic devices, but just computers built for a very special purpose — servicing IO requests quickly and efficiently. Within the array, or any storage system for that matter, you get your speed by two methods. First is the number of disk spindles you throw at it. A given disk can service only so many IOPS before performance goes down. The more disks you spread the load across the more IOPS you can support. Second, you use cache (RAM). If you have a lot of bursty writes, you can absorb that in cache and get really good performance. In the low write periods, the data is then de-staged out of cache and written to disk. Cache can also help with reads if you read the same blocks repeatedly (say a reference table in a database). The array (assuming its OS is worth a darn) will hold that data in cache and service the read requests from there instead of constantly having to read from disk.
– Look at how you access the array. There are typically two ways to do this: file access protocols and block level protocols.
— File level protocols are things like SMB/CIFS, AFP or NFS which run over IP and present files to the client. File access locking and management of the file system itself are done on the array. These are typically simple and cheap to set up and run and can perform well in many/most cases.
— Block level protocols (Fibre Channel, SAS, iSCSI, eSATA) present a “raw” LUN to the host and the host manages the file system, etc. These are typically far more complex to set up and manage, but you can typically get much better performance out of these. The performance increase often comes from (a) more efficient protocols and (b) you are only sending the data/disk blocks back and forth between the host and array whereas with file access protocols, there is a lot more data going back and forth.
So what can you do (assuming your only bottleneck is the network connection):
– Assuming the only bottleneck is your network connection, then increase that. Either bond several 1 GigE interfaces together (different OS’s call it something different: bonding, etherchannel, link aggregation, trunking, teaming etc) or move up to a 10 GigE network. If you do either of these, you need to make sure you can do something similar on the NAS side along with your network switches. If you go the bonding route, be sure that both your host and array will load balance across the different NICs and not just do simple failover.
– Ditch the NAS and get a directly connected array with a decent RAID controller. This would put a lot of disks (spread out the IO) directly connected (local disks) to your host.
– Move to a Fibre Channel based solution. You’ll need your array to support FC, an FC HBA for your host and possibly an FC switch. Similar to Ethernet, you can aggregate multiple FC connections to get increased performance. If you are going to go this route, you need to get someone that has experience with FC and SANs. It’s a bit of a different world than simple IP connections and you can easily shoot yourself in the foot, especially since this stuff isn’t cheap. This should be your last stop as it’s by far the most expensive.
Hope that helps.
Chris
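As a rough illustration of the bonding versus 10 GigE option above, the sketch below works out how many links would be needed to reach the 400 MB/s target mentioned in the original question. It uses raw line rates (bits divided by 8), so real payload throughput will be lower, and it assumes traffic is spread perfectly across bonded links, which many bonding modes will not do for a single client-to-NAS stream. The names are illustrative only.

```python
import math

# Nominal line rates in MB/s: raw bit rate divided by 8, before any overhead.
GIGE_MB_S = 1000 / 8        # 1 Gb/s Ethernet  -> 125 MB/s
TEN_GIGE_MB_S = 10_000 / 8  # 10 Gb/s Ethernet -> 1250 MB/s


def links_needed(target_mb_s: float, per_link_mb_s: float) -> int:
    """Smallest number of links whose combined raw rate meets the target."""
    return math.ceil(target_mb_s / per_link_mb_s)


print("Bonded 1 GigE links for 400 MB/s:", links_needed(400, GIGE_MB_S))
print("10 GigE links for 400 MB/s:", links_needed(400, TEN_GIGE_MB_S))
```

Even if the link count works out on paper, the array behind the network still has to be able to deliver that rate, which is the point of checking the NAS itself first.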
-
Bob Zelin
November 24, 2010 at 2:01 am
Very good answer, Chris Gordon –
what company are you with?
I see this is your first post.
Bob Zelin
-
Clayton Botkin
November 24, 2010 at 2:39 am
Hi – and thanks for the fast response… You are no doubt a master of your technical world. This SAN/fiber/SAS world is new to me. I am actually a focused scientist type, trying to make a small 2-3 person consulting biz work. The ‘budget’ will literally be ‘out of my checkbook’. My skills and linkage to complex and very large datasets began ‘in a university’… but that was years ago.
I’m clearly in over my head trying to learn SAN/fiber/SAS. I’m dealing with 2 PCs running Win7 Ultimate (ugh, I know). But I’m doing really intriguing modeling… on 200 gig files with the SAS stats software (Statistical Analysis System). I’ve worked with the SAS-CaryNC tech team enough to learn that the software first makes a huge shadow copy of the database as it begins to process. Most data I/O is sequential. My PC-NAS storage system is great for holding all my research and files etc. (4 Netgear NV+ units, each holding 4 1TB Seagate “ES” enterprise drives – but only spinning at 7-8k RPM).
I’ve been directly told that I’m I/O bound as I try to get the software to process these large 200G files. –> Enter direct attached SAN/fiber/SASi or something.
Thank you for the company/product suggestions… it has made for wonderful reading/learning… great stuff. Now… recall… I’m a simpleton. I need to sift through this to try to purchase a card and a neat-o new cable of some kind – fiber? SASi? – and a new cage of SAN or SAS drives. I have a good handle on ‘standard’ NAS hot-swap drives… but no idea about SAN.
I can see that the card possibly could be < $1000 (LSI/ATTO) – in the 6GB/1 port or 2 port flavor. I’m fine purchasing a new cage/box for a set of drives. But, need more specifics since I’m just so new to this. My budget may be just $5K… for connecting one PC directly. Is this possible? I’m even ok with something secondhand – but, I do see the latest product connect cards are boasting 6G throughput.
Thank you very much for your time.
-
Clayton Botkin
November 24, 2010 at 2:58 am
Hi Chris,
Geez you guys are such experts up here… I’m just a guy trying to get beyond NAS and I’m studying like heck to understand all this. Thanks for the response.
I’m just running a simple small-office network with consumer stuff: a Netgear managed switch with 4 Netgear NV+ NAS units, each with 4 1TB Seagate “ES” drives. 16TB storage. I did up the mem sticks to 1Gig in all 4 units. This NAS solution is great… I love it. But it’s not appropriate for the 200Gig statistical databases I’m pushing around. Plus the SAS stats software is typically loaded onto university sites, and frankly, either at companies or at universities… it’s never fast. Why? I think for the same reason I’m dealing with here: never enough SAN/fiber/iSCSI drives with ‘direct connections’. Mostly always via a ‘connection’ or LAN or something.
I think my need is beyond the NAS drives.
The IO is sequential I think — the software loads huge data as it courses through the code that we execute. The data (SAS-stats) system is designed to run sequentially — top record down to bottom. I believe the IO is the same.
The PCs are simply a Win7 Ultimate – high-end i7 4-core – 6-Gig RAM… the other one is dual Xeon – 12Gig RAM… they are “fast” – and the larger RAM does help. Your discussion on CACHE and file level protocols is wonderful… I need to study to understand this better.
I do agree that once I solve one bottleneck – there will be another (I believe that with a 10x increase from direct connect SAS/Fiber the next issue will be RAM – and I am planning a newer PC-server box with multi-Xeon that will allow up to 48 Gig RAM).
But, I’ve worked on this issue with the SAS-CaryNC guys and they tell me directly that I’m IO bound (due to NAS) and SAS/fiber is best.
My real challenge is how to get something (small) direct connected to one PC for reasonable (non-corporate) money.
Any directions or ideas help – and Thank You!
-
Chris Gordon
November 24, 2010 at 12:29 pm
Bob,
Thank you for the compliment.
I work for VeriSign as a System Architect designing highly scalable and highly available (at least we hope) infrastructures. (Corp disclaimer: anything I say is my thought or opinion and in no way connected with my employer.) My LinkedIn profile is https://www.linkedin.com/pub/chris-gordon/1/541/788.
With respect to video, I’m a bit of a neophyte. It’s a hobby that I’ve dabbled with from time to time and increasingly enjoy. I’ve been following a number of forums on the COW and a few other places to educate myself on the art and science of video.
Thanks,
Chris
-
Chris Gordon
November 24, 2010 at 12:59 pm
Those are little SOHO NAS arrays. Perfectly fine for sharing some documents or music at home or in a small office, but not for any heavy IO work. They typically have rather weak CPUs in them (a single core Atom if you’re lucky) which have to service all of the IO, manage the RAID (mirroring, parity, whatever you set) and run the CIFS/SMB server (assuming that’s what you’re using). For the amount of data you want to move, this thing just can’t keep up.
What I would recommend doing is to move from a NAS solution to something directly connected. This means an enclosure to hold the drives and a controller that sits in your computer to connect to the enclosure. What you get is going to depend on your budget. You can go to somewhere like Newegg.com and get the parts and build it yourself with the drives you have from your NAS or buy solutions. Bob mentioned a couple of companies in his initial reply and there are others such as G-Technology or even Dell and HP (look at the server storage solutions). I’d recommend getting a packaged solution (with support) instead of the build your own (to get good performance and reliability, you need some expertise).
Some other things to consider:
– Never fill your drives all the way full. First, any file system needs free space to maintain itself. Typically you never want to go over 80% full, but that can vary. Second, you end up taking a performance hit as the drive gets more and more full. Remember a disk is a spinning platter and you have to move the drive head over it to read or write. With more data on the disk, you’ll probably be reading/writing to it more. That means more time you wait for the head to get positioned properly.
– You have SATA drives now. These are good for holding large volumes of data cheaply and tend to perform relatively well for large sequential IOs. They can’t keep up with extremely high volumes of IOs or a lot of random IO. In those cases you need to move to faster disks — SAS or FC (there are disks that connect internally to the array with FC and this doesn’t mean specifically a piece of glass coming out of your PC to an array, though it typically ends up there) spinning at 10k or 15k RPM. These drives are smaller in capacity and more expensive than SATA disks.
– Which type of RAID are you using and why? Each of the RAID types has trade offs between performance, cost and redundancy. The Wikipedia article on RAID can give you some background on the different types. Also remember that RAID is NOT a backup solution. You are backing up your data somewhere, right?
– You may want to hook up with a storage expert to help set you up. If you’re in a university, there are probably people on staff there that could help (bribe them).
Hope this helps a bit more. There isn’t a magic bullet here, each solution is a balance of what you need and can afford.
Chris
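To make the capacity side of the RAID trade-off and the 80% fill guideline above concrete, here is a minimal sketch using the standard textbook capacity formulas for a few common RAID levels, applied to a four-drive, 1 TB-per-drive enclosure like the ones described in this thread. The function names are illustrative, and real arrays usually reserve some extra space for their own metadata.

```python
def usable_tb(level: str, drives: int, drive_tb: float) -> float:
    """Textbook usable capacity for a few common RAID levels."""
    if level == "0":       # striping only, no redundancy
        data_drives = drives
    elif level == "5":     # one drive's worth of parity
        data_drives = drives - 1
    elif level == "6":     # two drives' worth of parity
        data_drives = drives - 2
    elif level == "10":    # striped mirrors, half the drives hold copies
        data_drives = drives / 2
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    return data_drives * drive_tb


def working_tb(level: str, drives: int, drive_tb: float,
               max_fill: float = 0.80) -> float:
    """Usable capacity after applying the 'stay under ~80% full' guideline."""
    return usable_tb(level, drives, drive_tb) * max_fill


for level in ("0", "5", "6", "10"):
    print(f"RAID {level}: {usable_tb(level, 4, 1.0):.1f} TB usable, "
          f"{working_tb(level, 4, 1.0):.1f} TB at 80% full")
```

Capacity is only one axis; the performance and redundancy differences between levels still need to be weighed separately.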
-
Bob Zelin
November 24, 2010 at 6:46 pm
Chris is correct, and I will repeat his important words, which simply answer your question –
“What I would recommend doing is to move from a NAS solution to something directly connected. This means an enclosure to hold the drives and a controller that sits in your computer to connect to the enclosure.”
Let me make this simple for you – see the NAS you have right now ? You ain’t gonna make this faster. If you don’t need shared storage, GET RID OF THE NAS. The very same drives INSIDE THE NAS BOX YOU OWN will work DRAMATICALLY FASTER without the ethernet connection. If your question becomes – “but I have this NAS box, what can I buy to make it faster” – the answer is NOTHING. It’s the wrong box for your application.
Bob Zelin
-
Steve Modica
December 30, 2010 at 4:27 pm
I once had a doctor call us complaining that he spent $20k to upgrade his SGI system to faster graphics, but his 3D modeling was no faster (and in fact, slower). He was pissed.
We looked at the IO coming from the app he was using. It was reading a record, then writing it to the screen, reading another record, then writing it to the screen (and so on). His bottleneck was his 40MB/sec SATA drive.
When we had his programmer read many records at once, it was 6 times faster immediately. It would have been 6 times faster on the old machine too.
So in your case, I think you should profile the IO load before you look for a formula or you will find the wrong answer.
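A minimal sketch of that kind of profiling, assuming a plain sequential read is representative of the workload: it reads the same file with unbuffered reads of a given block size and reports how many read calls and how much time that took. The function name and the scratch file are illustrative only; on a real test, point it at a large file on the storage in question, and keep in mind that the OS file cache can make repeat runs look faster than the disk or network really is.

```python
import os
import tempfile
import time


def profile_read(path: str, block_size: int) -> tuple[int, int, float]:
    """Read a whole file with unbuffered reads of block_size bytes.

    Returns (bytes_read, read_calls, elapsed_seconds).
    """
    total = 0
    calls = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffer, so each read() goes to the OS.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
            calls += 1
    return total, calls, time.perf_counter() - start


if __name__ == "__main__":
    # Small scratch file purely for demonstration.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4 * 1024 * 1024))
    try:
        for block_size in (512, 1024 * 1024):
            nbytes, calls, secs = profile_read(tmp.name, block_size)
            print(f"block={block_size:>8} B  calls={calls:>6}  "
                  f"bytes={nbytes}  time={secs:.4f} s")
    finally:
        os.remove(tmp.name)
```

The record-at-a-time pattern in the story corresponds to the small block size: the same data moves, but in far more round trips.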
Steve
Steve Modica
CTO, Small Tree Communications
-
John Douglas
December 31, 2010 at 7:18 pm
Unfortunately you will most likely not be able to add a 4Gb interface between your current Netgear NAS and the Win 7 PC. The Netgear only has a GbE interface and will NOT accept an FC HBA. So with your current setup, the most you could hope to attain on the NAS would be by aggregating two ports to try to reach 2GbE. The other issue is that those are SATA drives, only capable of 80 IOPS.
An entry-level storage unit with a 4/8Gb FC controller interface and 4-6 450-600GB 15K SAS drives would cost approx $8-10K starting out. A typical 4-8Gb FC PCIe HBA for the PC or server costs about $1000.
If you have high I/O you can solve it with a 10GbE NIC, FCoE, or 4Gb-8Gb FC HBAs. You will definitely need a higher-IO hard drive like a 15K SAS drive. They have over twice the IOPS that a SATA drive does, or if budget allows, find a system that uses SSD. SSD can achieve 2500 IOPS or greater. If you would like to call please feel free.
JGDouglas
GovConnection
Technical Sales Consultant
8008000019 x75552
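The per-drive IOPS figures quoted above can be combined with the earlier point about spreading load across spindles. This is only a rough sketch: it assumes IOPS add up linearly across drives and ignores RAID write penalties, cache, and controller limits, so treat the results as optimistic upper bounds. The names are illustrative.

```python
import math

SATA_IOPS = 80    # per-drive figure quoted in the post above
SSD_IOPS = 2500   # "2500 IOPS or greater" quoted in the post above


def array_iops(drives: int, iops_per_drive: float) -> float:
    """Idealized aggregate IOPS if load spreads evenly across all drives."""
    return drives * iops_per_drive


def drives_to_match(target_iops: float, iops_per_drive: float) -> int:
    """Smallest number of drives whose idealized total reaches the target."""
    return math.ceil(target_iops / iops_per_drive)


# The setup in this thread has 16 SATA drives in total across four NAS units.
print("16 SATA drives:", array_iops(16, SATA_IOPS), "IOPS (idealized)")
print("SATA drives to match one SSD:", drives_to_match(SSD_IOPS, SATA_IOPS))
```

For a mostly sequential workload like the one described here, sustained MB/s usually matters more than IOPS, so these numbers are one input rather than the whole answer.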