Drop-frame issues persist… What did I miss??
Greetings to you all. I’ve been lurking here for a long time, but I’ve decided that I need to ask for a bit of help from the community.
I run a small system integrator in NYC. I usually build render farms, high-performance workstations, and storage servers for 3D companies, as well as for more mundane non-media clients.
I got a referral to a client that was setting up a Final Cut shop on a tight budget, and so I went about the calculations.
He wanted a centralized storage server with at least 16TB capacity, and he needed to connect 3 edit suites to it.
_________________________________________________
Here’s what I built out:
3x 8 core Nehalem Mac Pros as workstations
each with a Myricom 10Gb CX4 PCIe 8x NIC
a 6 port 10Gb HP managed ethernet switch capable of (and set to use) 9k jumbo packets
and a PC server.
I used a PC for a few reasons, primary among them being price and familiarity, and the availability of chassis which would allow me to direct-connect all the drives in my array.
The server is:
CPU: Xeon W3520 – 2.66GHz hyperthreaded quad core
RAM: 6GB DDR3 1066 ECC
NIC: Intel Dual Port 10Gb CX4 adapter (the same hardware as the Smalltree dual CX4 adapter)
RAID controller: Adaptec 51245 – 12 port SAS with dual core 1.2GHz engine and 512MB ECC battery backed cache
RAID HDDs: 12x Seagate Barracuda XT ST32000641AS 2TB 7200 RPM 64MB Cache SATA 6.0Gb/s (on the controller's supported hardware list)
System Drive: Mirrored 60 GB SSDs
__________________________________________________
Configuration:
The Macs:
OS X 10.6.4.
Based on the Myricom card's readme file, the jumbo frame MTU has been set to 8244 bytes. Quote:
“For better TCP performance, it is necessary to increase the TCP window size beyond the default value.
To make this change permanent, edit (or create) the file /etc/sysctl.conf with this line in it:
kern.ipc.maxsockbuf=2097152
On MacOSX, as with most BSD based stacks, restricting the TCP maximum segment size (MSS) to an even multiple of the mbuf cluster size keeps things nicely aligned, and results in improved performance when using jumbo frames. Unfortunately, the MacOSX TCP stack does not do this (Apple Bug Id #4919145), and the only way to do this is by adjusting the interface MTU by hand. The most common TCP packets will have an rfc1323 timestamp option, making for a header size of 52 bytes. Therefore, setting the MTU to 8192 + 52 (= 8244) results in optimal performance.”
…Who knew… Well, I did as they suggested, both with the MTU and the maxsockbuf
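For anyone who wants to check the readme's arithmetic, here is a rough Python sketch. The 52-byte header and the 8192-byte payload come from the quote. The 2048-byte mbuf cluster size and the function names are my own assumptions for illustration, not something the readme states.

```python
# Values marked "assumption" are illustrative, not taken from the readme.
MBUF_CLUSTER_BYTES = 2048      # assumption: common BSD mbuf cluster size
TCP_IP_HEADER_BYTES = 52       # from the readme: headers incl. RFC 1323 timestamps


def aligned_mtu(clusters, cluster_bytes=MBUF_CLUSTER_BYTES,
                header_bytes=TCP_IP_HEADER_BYTES):
    """Return an MTU whose TCP payload is a whole number of mbuf clusters."""
    return clusters * cluster_bytes + header_bytes


def payload_remainder(mtu, cluster_bytes=MBUF_CLUSTER_BYTES,
                      header_bytes=TCP_IP_HEADER_BYTES):
    """Bytes of payload left over after filling whole clusters (0 = aligned)."""
    return (mtu - header_bytes) % cluster_bytes


print(aligned_mtu(4))            # 8244, the value from the readme
print(payload_remainder(8244))   # 0, payload fills exactly four clusters
print(payload_remainder(9000))   # 756, a standard 9000-byte MTU is not aligned
print(2 * 1024 * 1024)           # 2097152, the kern.ipc.maxsockbuf value (2 MiB)
```

Under that cluster-size assumption, a standard 9000-byte MTU leaves a partial cluster in every full segment, which is the misalignment the readme is trying to avoid.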
The Server:
Windows Server 2008 R2 Standard
RAID: 12x 2TB drives in RAID 6 = 20TB capacity
2x 10Gb links from Intel NIC to switch configured into a 20Gb static trunk
Jumbo MTU manually configured to 8244 bytes so that jumbo frames stay aligned and do not fragment
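The 20TB figure for the array follows from RAID 6 reserving two drives' worth of space for parity. A minimal sketch of that calculation is below. The function names are mine, and the TiB conversion is only there because an operating system may report the volume in binary units.

```python
def raid6_usable_tb(drive_count, drive_tb):
    """Usable capacity of a RAID 6 set: two drives' worth goes to parity."""
    if drive_count < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drive_count - 2) * drive_tb


def tb_to_tib(tb):
    """Convert decimal terabytes (10**12 bytes) to tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


usable = raid6_usable_tb(12, 2)
print(usable)                        # 20
print(round(tb_to_tib(usable), 2))   # 18.19
```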
__________________________________________________
Initial internal benchmarks of Server:
System Drive: 200 MB/s sustained, 0.12 ms access time
RAID Read: 683 MB/s sustained, 693 MB/s average, 809 MB/s peak, 12 ms access time
RAID Write: 558 MB/s sustained, 599 MB/s average, 641 MB/s peak
__________________________________________________
At first I tried benchmarking SMB. Obviously its performance is atrocious, but I figured that with so much horsepower on both ends and such a fat pipe, everything would be fine.
I set up a 6GB RAM drive on one of the Mac Pros and started hauling huge files back and forth.
My total transfer speed never got above 135 MB/s Server -> Mac, and 120 MB/s Mac -> Server.
While this is a tiny fraction of the available bandwidth, I thought it would be OK. But when testing projects on the FCP workstations, I get dropped frames from time to time (admittedly not too often), even when dealing with 1080i ProRes422 HQ (~31 MB/s).
Strangely, when doing tests with 10-bit YUV 1080i footage (~166 MB/s), I get a very similar frequency of dropped frames.
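To put the SMB numbers side by side, here is a rough headroom check using only the figures quoted above. The 1250 MB/s link rate is simply 10 Gb/s divided by 8 and ignores all protocol overhead, and the names are mine.

```python
LINK_MB_S = 10_000 / 8   # one 10 Gb/s link as decimal MB/s, no protocol overhead

# Peak SMB transfer rates measured with the RAM drive, in MB/s.
SMB_CEILING = {"server_to_mac": 135, "mac_to_server": 120}

# Approximate stream rates quoted for the test footage, in MB/s.
STREAMS = {"ProRes422 HQ 1080i": 31, "10-bit YUV 1080i": 166}


def headroom(ceiling_mb_s, stream_mb_s):
    """Ratio of measured throughput to what one stream needs (>1 means spare)."""
    return ceiling_mb_s / stream_mb_s


read_ceiling = SMB_CEILING["server_to_mac"]
print(f"SMB read uses {read_ceiling / LINK_MB_S:.1%} of one link")
for name, rate in STREAMS.items():
    print(f"{name}: {headroom(read_ceiling, rate):.2f}x headroom")
```

Taken at face value, the measured SMB read ceiling covers a single ProRes stream several times over but falls below what one 10-bit YUV stream needs.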
So… I brought in a more novel approach, at least for testing.
I decided to try out iSCSI.
The Mac clients are using the GlobalSAN iSCSI initiator 4.0.0.204
The Server is running the free version of the StarWind iSCSI target, version 5.4
I set up two 2TB slices on the server, one for each of the two primary edit bays. These are fixed-size image files.
I mounted each slice on each of the FCP workstations, and formatted those slices HFS+
This means the Windows server is now directly hosting HFS+ partitions through iSCSI, cutting out a ton of the intermediary nonsense.
This dramatically boosted transfer performance. Server -> Mac (RAM drive) is now at around 320 MB/s, and Mac -> Server is around 280 MB/s
These rates drop to about 250 MB/s and 230 MB/s respectively when you hit the server from both workstations simultaneously. (hitting a random-read bottleneck on the RAID array I think)
These performance numbers should be pretty awesome for most things, except that the Macs are still dropping frames, especially when they’re both trying to pull 1080i 10-bit YUV (166 MB/s), but even occasionally when a single system is attempting to pull 1080i ProRes422 HQ @ 32 MB/s, which is just stupid…
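As a sanity check on the iSCSI figures, this sketch counts how many streams of each type fit into the measured read rates. It assumes the 250 MB/s figure is per workstation when both are reading, which is my interpretation, and the names are mine as well.

```python
import math

# Approximate stream rates quoted for the test footage, in MB/s.
STREAM_MB_S = {"ProRes422 HQ 1080i": 32, "10-bit YUV 1080i": 166}

# Read rates measured over iSCSI (Server -> Mac), in MB/s.
SINGLE_CLIENT_READ = 320
SHARED_CLIENT_READ = 250     # assumption: per workstation with both reading
RAID_SUSTAINED_READ = 683    # local benchmark on the server


def max_streams(available_mb_s, stream_mb_s):
    """Whole streams that fit in the available throughput."""
    return math.floor(available_mb_s / stream_mb_s)


def demand_fits(stream_mb_s, clients, array_mb_s):
    """True if one stream per client stays within the array's sustained read."""
    return stream_mb_s * clients <= array_mb_s


for name, rate in STREAM_MB_S.items():
    print(name,
          max_streams(SINGLE_CLIENT_READ, rate),
          max_streams(SHARED_CLIENT_READ, rate),
          demand_fits(rate, 2, RAID_SUSTAINED_READ))
```

On average throughput alone, both formats fit in every case, which is why the dropped frames look more like a latency or jitter problem than a bandwidth one.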
I got a utility that allows me to watch TCP/IP stats in real time, and this allowed me to troubleshoot a fragmentation error that was initially causing really poor network performance, but that was when I was trying to run everything with standard 9K jumbo frames. At this point it doesn’t look like I’m suffering from debilitating TCP fragmentation anymore.
So what the heck is going on??
Has anyone experienced this sort of issue before?? Is there some sort of magic bullet in FCP that I missed that creates bigger buffers and is more tolerant of jitter? Am I missing some configuration issue? Did I not properly spec some piece of hardware?
What am I missing???
What sort of performance should I be expecting out of this setup? Shouldn’t I at least be able to do 1080p ProRes422HQ through SMB?? Shouldn’t I be able to do 4K ProRes422HQ through iSCSI?
If I have to use iSCSI permanently will MetaSAN allow me to share the same volume amongst the three workstations??
_____________________________________
ALL YOUR HELP IS GREATLY APPRECIATED!
– /aron
_______________________________________
ARC Systems Consulting – Brooklyn, NY