Creative Communities of the World Forums

The peer to peer support community for media production professionals.


  • Fastest Current GPU setup for Resolve – Rendering HD Resolution

    Posted by Gautam Pinto on July 20, 2013 at 5:32 pm

    I’ve read the config guide, and seen most of the posts and benchmarks available online. I’m now even more unsure than before I started. 🙂 I’m looking for the absolute fastest HD transcoding on OSX from Alexa Pro-Res 444 Log C > 5 Nodes > 709 LUT to Pro-Res LT.

From what I gather, the Titan is supported in 10.8.4, as are 4x GPUs.

I’ll be using a 2010 12-core Mac Pro 5,1 with a 5-slot expansion chassis. Currently I’m using 3x GTX 580s:
one internal-power version in the tower, and two in the expansion chassis. I’m confused about the best-performing card setup and would like some real-world advice. Here are the configuration possibilities as I see them, assuming I have one slot available in the Mac Pro and 2 or 3 slots available in the expansion chassis. The other slots will be used for other hardware that I can’t remove from the chain.

I prefer elegance over a 3% performance increase, and I prefer a lower cost-to-performance threshold. In other words, I’d prefer not to spend a million dollars to gain a 3% performance increase. I’d also like the most optimal solution, with less heat and power draw and fewer slots used up where possible.

    This is assuming HD resolution render speed as the primary performance metric, but it would be nice to have a seamless transition to 4K, and Resolve V10. For this application, rendering dailies, I don’t see myself going over 5 nodes plus a LUT. NR and Tracking would be minimal, if used at all, due to time constraints.

I would like to have boot screens in OS X, so an EFI-flashed GUI card is a must, or an officially supported OS X video card.

Mac Pro Slot 1 = Quadro 4000, GTX 570, GTX 770, GTX 580 (EFI-flashed internal-power version) or Radeon
PCIe Chassis = 2x GTX 690, 2x TITAN, 3x TITAN, 3x GTX 770, 3x GTX 580, or other?

I’ve seen that the 680s and 690s have had lower performance than the 580s in the past. Is this still the case?

Should I stick with the 3x 580s and just wait until V10, since the astronomical cost of 3x TITANs won’t yield that much of a performance increase compared to the 580s?

I’d only want to use cards with 3GB of RAM or above. And yes, the other parts of the system will have high-performance hardware, including the RAIDs, CPUs, RAM, etc.

And lastly, yes, I am using the full version of Resolve, not the Lite version. And yes, I do know that enabling GUI-as-GPU is now an option in Resolve 9.1.5.

    Any advice to help me navigate through this is very much appreciated. I have a 1500W power supply in the expansion chassis, and double width slots so I don’t think I’ll have any power issues.

    I’d love to hear some real-world tests people have done that have turned out to be stable and powerful.

    I’m going to cross post this on LGG, and BMDF, as not everyone is a bovine scavenger.
    Apologies for cross posting.

  • 12 Replies
  • Juan Salvo

    July 20, 2013 at 6:11 pm

What you’re describing doesn’t require that much GPU power. I think your bottleneck is the speed of your drives and CPU, not your GPUs.

    color/post/workflow
    https://JuanSalvo.com

  • Eric Hansen

    July 20, 2013 at 6:32 pm

    Gautam, with your current setup, what speeds are you seeing for:

    – FPS during render
    – CPU use in Activity Monitor
    – MB/s in Activity Monitor

As I mentioned in your earlier post in the RAID forum, you need to find your current bottleneck, because as Juan mentions, HD ProRes to HD ProRes is not that demanding.
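To put a rough number on the MB/s side of those measurements, here is a back-of-envelope sketch (mine, not from this thread; the codec data rates are assumed ballpark figures, roughly the ProRes HQ and ProRes LT rates Apple quotes for 1080p30):

```python
# Rough check: given a codec's approximate data rate at its native frame
# rate, how many MB/s must the disks sustain to feed a render at a given fps?

def required_mb_per_sec(codec_mbit_per_sec, native_fps, render_fps):
    """Disk throughput (MB/s) needed to move one stream at render_fps.

    codec_mbit_per_sec: approximate codec data rate in megabits/second
    native_fps: frame rate that data rate is quoted at
    render_fps: frames per second the render is actually pushing
    """
    mbit_per_frame = codec_mbit_per_sec / native_fps
    return mbit_per_frame * render_fps / 8  # bits -> bytes

# Assumed rates: ProRes HQ ~220 Mb/s in, ProRes LT ~102 Mb/s out (1080p30).
read = required_mb_per_sec(220, 30, 90)   # reading the source at 90 fps
write = required_mb_per_sec(102, 30, 90)  # writing ProRes LT at 90 fps
print(round(read + write, 1), "MB/s total")
```

If the total lands well under what AJA System Test reports for the RAID, the drives are probably not the limiting container.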

Question for Juan: is that much GPU power just being wasted on PCIe 2.0?

Personally, I would wait for Resolve 10 before purchasing new GPUs. The BMD reps on the forums have been alluding to processing changes.

    e

    Eric Hansen
    Production Workflow Designer / Consultant / Colorist / DIT
    https://www.erichansen.tv

  • Helge Tjelta

    July 21, 2013 at 10:47 am

I agree with Juan here.

I built a system with a 2012 Mac, 32 GB RAM and a Cubix expansion box. This one had 1 RED Rocket and 3x GTX 570s (2.5GB version). GUI was on an internal ATI 5770. I also had a BMD card and an FC card internal.

This system performed really well; it “never” ran out of nodes on HD. And 4K RED ran in real time as well, with a lot of nodes…

Disk I/O was on an Xsan system (normally 150–200 MB/s), so not that fast. But no problem for Resolve to do its job.

Card-wise, performance vs. price…

    https://www.videocardbenchmark.net/high_end_gpus.html

    Helge

  • Gautam Pinto

    July 22, 2013 at 8:38 pm

    Thanks Juan, what I am wondering is if increasing the GPU power will get me faster FPS at render time. Real time grading is no problem, and I don’t need any more GPU power for that. However, turning around dailies as fast as possible is a requirement, and I wanted to know if I can get upwards of 150FPS by increasing GPU power.

  • Gautam Pinto

    July 22, 2013 at 8:46 pm

    Eric, thanks for your response. Since I’m putting in a new RAID, and removing red rockets this week, I’ll do some tests and post back. Right now, I’m getting around 90FPS for renders, but the GPU meter goes red on occasion, and activity monitor never shows full CPU utilization. I’ll do some further testing late in the week.

I’ll try and experiment with 3x 580s in the expansion chassis with a Quadro for GUI, or 2x 580s in the chassis and one 580 as GUI/GPU. Since I have a RAID controller in the chassis, I’ll see what impact this has on performance as well.

  • Laco Gaal

    July 23, 2013 at 7:11 pm

    Have you considered using two machines?
An iMac can do 60fps…

  • Juan Salvo

    July 23, 2013 at 8:44 pm

Please reread my post. The problem is not your GPU; it’s your CPU or maybe drives… From the sound of it, I’d say your CPU. The graphics cards are not what’s holding you back; it’s just the laws of physics.

    So no, a faster GPU will not make your renders faster.

    color/post/workflow
    https://JuanSalvo.com

  • Eric Johnson

    July 24, 2013 at 4:23 pm

    [Juan Salvo] “It’s your CPU or maybe drives… From the sound of it, I’d say your CPU”

    Juan:

Your comment raised an interesting question, at least for me… Is there a way to determine at what point your CPU becomes the bottleneck?

I am able to determine, within an allowable margin of error, the approximate frames/sec my drives can achieve as a product of data transfer, but that does not account for per-frame encode/decode or any other CPU processing that is happening. Nor does it account for what my GPU is processing…

For example:
I have a RAID5 that does 280 MB/s, according to the AJA System Test with a 16GB file.
At DNxHD 175/x or ProRes (HQ), 24fps (not 23.976) for the sake of discussion, that is approximately 290fps (depending on how you do the math and where you may or may not round; I did it a couple of ways and got between 280 and 310. ProRes being VBR, I feel OK using 290 for this discussion).
In Resolve I can get around 60fps on shots without grades (if I remember correctly; this is mostly from memory)…

Based on that information, using Resolve strictly as a means to transcode/transfer media, I am operating at 20% of my drive speed.

In this situation I know my GPU is on the low side of processing power, but that should have limited impact on what the CPU is actually able to process… So is there a way to determine what portion of that 80% loss is a result of what the CPU is doing?
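For what it’s worth, that arithmetic can be written out like so (a sketch using the numbers from the post above; the 175 Mb/s codec rate is a ballpark, not a measured figure):

```python
# Disk-limited fps ceiling for a ~175 Mb/s codec at 24 fps native,
# fed from a RAID measured at 280 MB/s with AJA System Test.

DISK_MB_S = 280.0        # measured sequential read
CODEC_MBIT_S = 175.0     # DNxHD 175 / ProRes HQ ballpark at 24 fps
NATIVE_FPS = 24.0

mb_per_frame = CODEC_MBIT_S / 8 / NATIVE_FPS   # ~0.91 MB per frame
disk_limited_fps = DISK_MB_S / mb_per_frame    # read-side ceiling, ~300 fps

observed_fps = 60.0
print(f"disk ceiling ~ {disk_limited_fps:.0f} fps, "
      f"using {observed_fps / disk_limited_fps:.0%} of it")
```

That lines up with the ~280–310 fps range above, and with the observation that a 60 fps render is only touching about a fifth of what the drives could feed.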

The system in question is similar to the OP’s: Mac Pro 5,1, 12×2.66 GHz, 26 GB RAM, 5770/Quadro 4000, OS X 10.7.5, Resolve 9.1.3, eSATA-II RAID5 (8-drive).

  • Juan Salvo

    July 24, 2013 at 5:53 pm

Think of it this way: there are four separate containers at play.

    -drive speed
    -CPU
    -GPU
    -Bus/Memory bandwidth

Each container has its own limits, but in aggregate the limit is whatever the weakest link is.

Once you get to ~90 frames (depending on data rate/compression type) you start to max out the number of frames your CPU can decode and encode simultaneously in ProRes. Different codecs would have different limits, and of course efficiencies in the way the codec is written make a difference too. Basically this stuff gets complex, but 90–100 frames is insanely fast for a box that is 3 years old (at best)! 🙂
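The weakest-link idea above can be pictured as a one-liner (the per-stage ceilings below are made-up illustrative numbers, not measurements of any real box):

```python
# "Four containers": overall throughput is the minimum of the per-stage
# fps ceilings, so upgrading any non-bottleneck stage changes nothing.

stage_fps_ceiling = {
    "disk": 300,        # fps the RAID could feed (illustrative)
    "cpu_codec": 95,    # fps the CPU can decode+encode ProRes (illustrative)
    "gpu": 400,         # fps the GPUs could grade at HD (illustrative)
    "bus": 250,         # fps PCIe/memory bandwidth allows (illustrative)
}

bottleneck = min(stage_fps_ceiling, key=stage_fps_ceiling.get)
print(f"render speed ~ {stage_fps_ceiling[bottleneck]} fps, "
      f"limited by {bottleneck}")
```

With numbers like these, tripling the GPU ceiling leaves the render at ~95 fps; only raising the CPU codec ceiling moves it.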

    color/post/workflow
    https://JuanSalvo.com

  • Eric Johnson

    July 24, 2013 at 6:43 pm

The basic containers I understand (as far as where overall performance can be impacted), but where I get a little “iffy” is trying to equate CPU/RAM speed to R/W, encode/decode or buffering, since there is a disparity in the speed metrics: B/s vs. Hz.

Of course this gets additionally muddled when you take into account that CPU clock speed is a soft number, with every new chip operating at near-similar clock speeds but being differently optimized for multithreading or per-chip core count…

Is there a way to force encode/decode of a particular codec onto the CPU only, to determine how that codec performs? Obviously the results would be slightly skewed for the system being tested, because of the aforementioned limiting factors, but if I could determine that my system’s CPU-level encode/decode of a particular codec is X, then it would be possible to know whether N GPUs is optimal based on Y render results… There would still be variables, of course, but the general principle remains true…

Beyond all of that, though, knowing that 90–100 ProRes (HQ) 1080 frames is a lofty goal, I know more than I did. Which is always helpful.

