New Mac Pro Resolve Benchmarks (D700 vs D500 vs 2010)

Discussion in 'CPU & GPU' started by Jason Myres, Jan 7, 2014.

  1. Jason Myres Moderator

    Message Count:
    2,321
    Location:
    Los Angeles
    ***Updated Jan 20th with Complete Benchmarks for both the D500 and D700 2013 Mac Pro***

    Comparison results for the 2010 12-Core using single and dual AMD R9 280x's are now also included. For additional background on the test and a detailed analysis of the results take a look at the latest Coloristos Episode over in the Media Forum:

    Coloristos Episode 15 - "New Mac Pro"

    http://liftgammagain.com/forum/inde...ristos-colorcast-episode-15-new-mac-pro.2310/

    -----

    I had an opportunity to test a new Mac Pro yesterday at New Media Hollywood, a post integrator here in Los Angeles. The unit I worked with was the stock 3.5GHz 6-Core D500 available for $3999 at the Apple Store. I was able to run a few different benchmarks, including the Resolve 9 Standard Candle, as well as a variation Juan's been working on called the Standard Flashlight, some Sapphire Plugin benchmarks, a 6-Node HD encoding test, along with playback performance for a number of cameras.

    For comparison, I ran all of the tests on my 2010 Mac Pro 12-Core 2.66GHz 24GB with a stock Nvidia GTX 780 3GB in both OS X 10.9.1 and 10.8.4.

    I should have the chance to test a 12-Core D700 when it arrives in a few days. In the meantime, the benchmarks below should help give you an idea of what you get for $4000.


    [New Jan 7th]

    The primary goal was to see where the new Mac Pro stands compared to a legacy Mac Pro. 

The new Mac Pro is different enough (Mavericks, OpenCL, AMD) that trying to create a one-for-one comparison doesn't seem as useful as something more practical like "Is my money better spent upgrading my legacy Mac Pro, or buying a new one." That's a question on everyone's mind, and the concept the tests were built around.

    To quantify what you get for $4K, add-ons like a RAID array or additional RAM were excluded for now. The issue is also mitigated by its SSD storage. It's where the NMP simply destroys on all counts (except for capacity), and it allows the machine to boot in just over 12 seconds. It's so fast one of the New Media engineers thought I had woken it from sleep when I turned it on. So running almost any test to/from the internal SSD is not a bottleneck.

    For comparison, I used a 2010 12-Core Mac Pro which is well-known, and popular with colorists and finishers. I installed the most modern CUDA GPU you can use without a PCI-E chassis or additional power, which is a stock GTX 780 3GB. It's an option a lot of people would probably look at if they decided to upgrade a legacy Mac Pro. Dual GPUs would be more "fair", but the goal isn't to beat the New Mac Pro in testing, it's to compare a well-known machine with an unknown one.

    The test is also a comparison of what we've been using versus what Apple would like us to use: AMD vs Nvidia, OpenCL vs CUDA, Mavericks vs Mountain Lion, Expandable vs Self-Contained. Each one represents an entire sea change in its own right, much less all of them taken together. It seemed the best way to begin was to compare a good example from each generation, and see where the numbers land.

    [End]

    [New Jan 20th]

    Addition of the Radeon R9 280x

    Single and Dual Radeon R9 280x configurations have been added which allow some useful comparisons (Thanks Jake).

    -Dual AMD D700 Firepros in a 2013 Mac Pro Vs Dual AMD R9 280x's in a 2010 Mac Pro
    -AMD R9 280x in OpenCL Vs Nvidia GTX780 in CUDA

    Interesting Note: The R9 280x in 10.8 appears as a "Tahiti XT Prototype Compute Engine", however in 10.9.1 it shows up as the "D700 Compute Engine". The R9 280x is basically identical to the original HD 7970 released in 2011. This verifies that the D700s in the 2013 Mac Pro are essentially HD 7970s with 6GB of Vram.

    Here is how dual R9 280x's appear in Resolve on 10.9.1 on a 2010 Mac Pro:

    Mac Pro 2010 GPU Dual R9 280x Resolve .jpg


    [New Jan 20th]

    Test Configurations

    1) 2013 Mac Pro 2.7GHz 12-Core/ 64GB/ 1TB SSD with Dual AMD FirePro D700s
    OS X 10.9.1 with Resolve 10.0.2 in OpenCL 1.2

    2) 2013 Mac Pro 3.5GHz 6-Core/ 16GB/ 256GB SSD with Dual AMD FirePro D500s
    OS X 10.9.1 with Resolve 10.0.2 in OpenCL 1.1


    3) 2010 Mac Pro 2.66GHz 12-Core/ 24GB with DUAL AMD Radeon R9 280x 3GB
    1.5TB Boot, plus Internal 4-Drive RAID5 Array (450 MB/s)
    OS X 10.9.1 Resolve 10.0.2 running OpenCL 1.2

    4) 2010 Mac Pro 2.66GHz 12-Core/ 24GB with DUAL AMD Radeon R9 280x 3GB
    1.5TB Boot, plus Internal 4-Drive RAID5 Array (450 MB/s)
    OS X 10.8.4 Resolve 10.0.2 running OpenCL 1.1


    5) 2010 Mac Pro 2.66GHz 12-Core/ 24GB with a Single AMD Radeon R9 280x 3GB
    1.5TB Boot, plus Internal 4-Drive RAID5 Array (450 MB/s)
    OS X 10.9.1 Resolve 10.0.2 running OpenCL 1.2

    6) 2010 Mac Pro 2.66GHz 12-Core/ 24GB with a Single AMD Radeon R9 280x 3GB
    1.5TB Boot, plus Internal 4-Drive RAID5 Array (450 MB/s)
    OS X 10.8.4 Resolve 10.0.2 running OpenCL 1.1


    7) 2010 Mac Pro 2.66GHz 12-Core/ 24GB with a Single Nvidia GTX 780 3GB
    1.5TB Boot, plus Internal 4-Drive RAID5 Array (450 MB/s)
    OS X 10.9.1 Resolve 10.0.2 running CUDA 5.5.25

    8) 2010 Mac Pro 2.66GHz 12-Core/ 24GB with a Single Nvidia GTX 780 3GB
    1.5TB Boot, plus Internal 4-Drive RAID5 Array (450 MB/s)
    OS X 10.8.4 Resolve 10.0.2 running CUDA 5.5.25



    ---

    Scoring Notes

    FPS scores represent the lowest sustained number of frames per second for each test.
    Percentages shown represent the highest sustained CPU load for each test.
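
    For anyone repeating these tests with their own readings, here is a minimal Python sketch of one way to turn a list of samples into a "lowest sustained" score. The rolling-average window, the sample values, and the function names are all assumptions for illustration; the scores in this thread were read off during playback, not computed this way.

```python
from collections import deque


def lowest_sustained_fps(samples, window=3):
    """Return the lowest rolling-average fps over `window` consecutive samples.

    `samples` is a sequence of fps readings (for example, one per second).
    The window length is an assumption, not something the tests specified.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(samples) < window:
        raise ValueError("need at least `window` samples")
    recent = deque(maxlen=window)  # holds only the last `window` readings
    lowest = None
    for fps in samples:
        recent.append(fps)
        if len(recent) == window:
            avg = sum(recent) / window
            if lowest is None or avg < lowest:
                lowest = avg
    return lowest


def highest_sustained_load(samples, window=3):
    """Highest rolling-average CPU load (percent), same windowing assumption."""
    # Negating the samples lets us reuse the minimum search to find a maximum.
    return -lowest_sustained_fps([-s for s in samples], window)


# Hypothetical readings: a single dip to 12 fps pulls the 3-sample average to 20.
print(lowest_sustained_fps([24, 24, 12, 24, 24, 24], window=3))
print(highest_sustained_load([50, 90, 90, 90, 40], window=3))
```

    A window of 1 reduces this to the plain minimum (or maximum), which is the strictest reading of "sustained".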

    ---

    [New Jan 20th]

    Test Notes

    -Separate Video Card for GUI: 2010 Mac Pro tests were run with the GTX 780 alone, and with a GT120 for GUI, but results were the same in either case.

    -Use Display GPU for Compute: Resolve testing was completed with "Use Display GPU for Compute" enabled, however in the 2013 Mac Pro having this setting enabled or disabled has no effect on performance.

    -Nvidia in OpenCL Mode: 2010 Mac Pro tests were run with Resolve in OpenCL mode, but the scores were so poor (i.e. 2-3 fps with 4 nodes of blur), they were not included in the overall results.

    -Media Drives: Media Playback for the 2013 Mac Pro was from its internal SSD (1200 MB/s), and for the 2010 Mac Pro from its internal RAID5 array (450 MB/s).


    4K Playback in Resolve:

    -The 2010 Mac Pro with the GTX 780 3GB cannot play back 4K ProRes 4444 at 24 fps with no color corrections applied. Initially, it seemed there might be an issue, but Paul Provost verified the same result on his 2012 12-Core running a single GTX Titan.

    However, Juan Salvo was able to play back the same 4K ProRes 4444 sample at 24 fps on his 2010 8-Core running dual GTX 590s. This surprised us, as up until this point we had assumed a Resolve project with no corrections applied wouldn't create a significant GPU load, but apparently it does. Once I added dual R9 280x's to my 2010 12-Core for testing, the same project with no color corrections applied finally played back at 24 fps.

    [End]

    Test Details

    Resolve 9 Standard Candle
    Focus: GPU
    Project: (Resolve 10) 720x576 Source to 1080p/24
    Result: Frames Per Second
    Notes: The traditional benchmark for Resolve GPU performance using a combination of Blur and Noise Reduction Nodes. Results are based on the minimum sustained Frames Per Second by cycling through each Version of the test grade.

    Juan Salvo's Standard Flashlight
    Focus: GPU, VRam, System Bus Bandwidth
    Project: (Resolve 10) 4096 x 2160/ 59.94 Source to 1080p/24
    Result: Frames Per Second
    Notes: A new Resolve test developed by Juan Salvo. Similar in format to the Standard Candle, but designed to test the limits of modern, multi-GPU Resolve configurations from several new angles, including overall GPU compute, memory, and bus bandwidth performance. Results are based on the minimum sustained Frames Per Second by cycling through each Version of the test grade.

    Sapphire OFX
    Focus: CPU
    Project: (Resolve 10) 1080p/23.98 ProRes 4444
    Result: Frames Per Second (Total CPU Load)
    Notes: GenArts Sapphire OFX Plugin test based on Lens Flare, Film Effect, and Z Blur performance in a 1080p/23.98 ProRes 4444 Project. Presets were chosen based on CPU load and tested in order of difficulty from easiest to hardest. Machines with higher CPU clock speeds generally offered the highest scores.

    4K Playback
    Focus: GPU, System Bus Bandwidth
    Project: (Resolve 10) 4096 x 2160p/ 23.98 ProRes 4444
    Result: Frames Per Second
    Notes: 4K Playback in Resolve 10 created by adding Serial Corrector nodes with basic adjustments (Base, Hi Key, Lo Key, Vignette, repeat) to a 4K ProRes 4444 Project until the sustained playback dropped below 24 fps.

    4K Spatial NR
    Focus: GPU, Vram, System Bus Bandwidth
    Project: (Resolve 10) 4096 x 2160p/ 23.98 ProRes 4444
    Result: Frames Per Second
    Notes: Single node, 4K Spatial Noise Reduction test in Resolve 10. The test used two settings, one easier (Small radius, Chroma /Luma Threshold of 5) and one difficult (Large radius, Chroma /Luma Threshold of 10).

    HD Render
    Focus: CPU, GPU, Vram, System Bus Bandwidth
    Project: (Resolve 10) 1 Minute 1080p/23.98 ProRes 4444 Sequence
    Result: Elapsed Time in Min:Sec
    Notes: A one minute ProRes 4444 sequence in Resolve 10 with 6 nodes (Base Grade, Noise Reduction, Hi Key, Low Key, Color Push, Vignette), separately rendered out to a single 1080p/ 23.98 ProRes 4444 file, and then a H.264 file.

    HD Render Test Node Tree:

    Encode Setup.jpg
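
    Since the HD Render result is an elapsed time rather than a frame rate, it can help to convert it for comparison with the playback tests. A small Python sketch follows, assuming the one-minute sequence runs at 23.976 fps (the usual meaning of "23.98"); the "1:30" time is a made-up example, not a result from the chart, and the function names are hypothetical.

```python
def parse_min_sec(text):
    """Convert an elapsed time written as 'M:SS' into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def render_speed(elapsed, clip_seconds=60.0, fps=23.976):
    """Return (rendered frames per second, real-time factor) for one render.

    clip_seconds and fps describe the source sequence; both defaults are
    assumptions based on the test description (1 minute at 23.98).
    """
    total_frames = clip_seconds * fps
    elapsed_seconds = parse_min_sec(elapsed)
    return total_frames / elapsed_seconds, clip_seconds / elapsed_seconds


# Hypothetical elapsed time of 1 minute 30 seconds.
frames_per_second, realtime_factor = render_speed("1:30")
print(f"{frames_per_second:.2f} fps, {realtime_factor:.2f}x real time")
```

    A real-time factor above 1.0 means the render finished faster than the clip plays.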

    ARRIRAW Playback
    Project: (Resolve 10) 2880x1620 Source to 1080p/23.98
    Result: Frames Per Second (CPU Load)
    Notes: Resolve 10 ARRIRAW Playback at Full, Half, and Quarter Resolution.

    Sony F65 Playback
    Project: (Resolve 10) 4096 x 2160 Source to 1080p/23.98
    Result: Frames Per Second (CPU Load)
    Notes: Resolve 10 Sony F65 Playback at Full-Resolve, Full-Sony, Half, and Quarter Resolution.

    Blackmagic 2.5K BMCC Playback
    Project: (Resolve 10) 2400 x 1350 Source to 1080p/23.98
    Result: Frames Per Second (CPU Load)
    Notes: Resolve 10 Blackmagic 2.5K BMCC Playback at Full, Half, and Quarter Resolution.

    Phantom Flex Playback
    Project: (Resolve 10) 2560 x 1440 CINE 10-Bit Source to 1080p/23.98
    Result: Frames Per Second (CPU Load)
    Notes: Resolve 10 Phantom Flex Playback.

    RED R3D Playback
    Focus: CPU Debayer Performance
    Project: (Resolve 10) 1080p/23.98
    Result: Frames Per Second (CPU Load)
    Notes: Resolve 10 RED Debayer at Full Premium, Half Premium, Half Good, and Quarter Good.
    4K RED ONE: (4096 x 2304)
    5K RED EPIC: (5120x2560)
    6K DRAGON: (6144 x 3160)

    REDCINE-X GPU Debayer
    Focus: OpenCL R3D GPU Debayer
    Result: Elapsed Time in Min:Sec
    Notes: REDCINE-X GPU Debayer of three different frame sizes to ProRes 4444/ 23.98 at the original source frame size.


    Final Test Results

    Note: Click refresh on your browser if the results image doesn't appear initially. PDF available below.

    Mac Pro Benchmarks.jpg

    Attached Files:

  2. Raffaele Mariotti

    Message Count:
    26
    Location:
    Italy
    very interesting!
    thank you for your effort, Jason.
  3. Frank Glencairn

    Message Count:
    216
    Did you run the tests on the new Mac from/to the internal drive or from/to some RAID?
    Alejandro Matus likes this.
  4. Margus Voll

    Message Count:
    1,094
    Location:
    Tallinn, Estonia
    We can clearly see that when the CPU is needed, as with RED footage, the old setup is a tiny bit better.

    will be waiting for D700 test now ;)
    Jason Myres likes this.
  5. Margus Voll

    Message Count:
    1,094
    Location:
    Tallinn, Estonia
    Frank has a valid point here too, I think.

    I have seen that when I render to different arrays the results sometimes differ by 10 fps.
  6. Tom Parish

    Message Count:
    59
    Location:
    Austin
    Thank you Jason. You guys put a lot of work into this. I'm very curious to see the impact of the D700 cards.
  7. Esteban Aguilera Moderator

    Message Count:
    238
    Location:
    Madrid / Spain
    Awesome job Jason !!!!

    Did you run any tests on the old Mac Pro with Resolve set to work in OpenCL?
  8. Eric Chun

    Message Count:
    1
    Location:
    South Korea
    Thank you a lot!
  9. Alejandro Matus

    Message Count:
    2
    Location:
    Barcelona
    Thanks Jason, the tests are very good!
    The question Frank asked is the key.

    According to ATTO's report on Thunderbolt 2:

    http://www.jigsaw24.com/news/wp-content/uploads/2013/12/TechBriefThunderboltComparison1.pdf

    The speed is the same as Thunderbolt 1.
    4K monitors can be connected and you can add more devices (twice as many), but the speed is the same.
    The maximum speed you can get with Thunderbolt is 714 MB/s (5712 Mb/s) for read and write.
    I checked this with ATTO engineers at the last IBC.

    In a Mac Pro 2.66 12-Core I can get up to 2000 MB/s (16000 Mb/s) for read and write (16 x 3TB Hitachi Deskstar SAS RAID 5).

    How fast were the drives used in the tests (Mac Pro 2010 & 2013)?

    How fast is the internal disk of the new Mac Pro?

    We can't work on projects with large files from the new Mac Pro's internal disk, and Thunderbolt RAIDs are not fast enough to move files like 16-bit TIFF at 2K or 4K; we need more than 714 MB/s for DI work.
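
    As a rough check on these figures, here is a minimal Python sketch of the arithmetic for uncompressed 16-bit RGB frames. The frame sizes (2048 x 1080 and 4096 x 2160), the 24 fps rate, and the function names are assumptions for illustration; file headers and filesystem overhead are ignored, and MB means 1,000,000 bytes.

```python
THUNDERBOLT_LIMIT_MB_S = 714  # throughput figure quoted above


def uncompressed_rate_mb_s(width, height, fps=24, channels=3, bits_per_channel=16):
    """Data rate in MB/s for one stream of uncompressed frames."""
    bytes_per_frame = width * height * channels * bits_per_channel // 8
    return bytes_per_frame * fps / 1_000_000


def mbytes_to_mbits(mb_per_s):
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8


# Assumed frame sizes; other 2K/4K rasters will give different numbers.
for label, (w, h) in {"2K": (2048, 1080), "4K": (4096, 2160)}.items():
    rate = uncompressed_rate_mb_s(w, h)
    verdict = "over" if rate > THUNDERBOLT_LIMIT_MB_S else "under"
    print(f"{label}: {rate:.1f} MB/s ({verdict} {THUNDERBOLT_LIMIT_MB_S} MB/s)")

print(mbytes_to_mbits(THUNDERBOLT_LIMIT_MB_S), "Mb/s")
```

    Under these assumptions a single 4K stream needs well over 714 MB/s, while a single 2K stream fits; several simultaneous streams or copies would raise the requirement accordingly.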

    We also need more speed than Thunderbolt provides to perform multiple tasks on set (copies, dailies, real-time playback, multiple cameras).

    This is something to consider before buying a new MacPro.

    thanks
  10. Eric B Johnson

    Message Count:
    153
    Location:
    Los Angeles, CA
    I find it pretty impressive that for the R3D decode, the new Mac is performing similarly to the 2010 with half the cores... I would be interested to see a REDCINE-X GPU decode between the two machines also...

    And as Alejandro mentioned, my biggest concern with the new Mac is getting media in and out of it at a file level.
  11. jake blackstone

    Message Count:
    690
    Location:
    Los Angeles
    This is all interesting info for many Resolve users thinking of making the nMP plunge. Thank you for sharing it with everyone.
    My biggest problem with this test is the wrong conclusion it suggests: that the 2013 nMP with 6 cores is better than the 2010 with 12 cores. What I took away from this test is that two GPUs (D500) are better than one (GTX 780).
    Because of that, any GPU rendering comparison is not really indicative of the whole hardware platform's performance. Now, if you had the same two computers (2013 vs 2010), but this time both equipped with two comparable GPUs, that would be a better comparison. Also, I think it is a bit too early to get excited, because a number of questions haven't even been addressed yet, such as running a full-blown Resolve system with a TB-connected RR-X, Decklink, maybe some other SAN connectivity, etc. I have no doubt that strictly on CPU performance the new nMP is a winner. But once you start taking into account the many other important considerations (and there are many that need to be carefully considered), I wonder if it's really worth the expense and effort of getting such a system at this time.
  12. Eric B Johnson

    Message Count:
    153
    Location:
    Los Angeles, CA
    The only test, to my knowledge, that was restricted to CPU performance was the R3D decode. And the machine tested did not outperform the previous tower, but came admirably close with half the CPU cores.

    In all other instances, again, per my understanding, all tests were a combination of CPU/GPU performance... Which makes the data compelling, but as Jake stated, less than an apples to apples comparison... But to be fair, I don't see there being a way to do a true apples to apples comparison. The machines are just too different for that. In my opinion.
  13. Jason Myres Moderator

    Message Count:
    2,321
    Location:
    Los Angeles
    From its internal SSD. The max read data rate for playback was the Sony F65 at 246 MB/s, max write data rate was the ProRes4444 in the HD encoding test at around 30 MB/s, so it was no problem for either machine.
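
    To put numbers on "no problem for either machine", here is a quick Python sketch of the headroom using the figures from this thread (246 MB/s peak read, 1200 MB/s for the 2013 SSD, 450 MB/s for the 2010 RAID5). The function and variable names are made up for illustration.

```python
def headroom(drive_mb_s, required_mb_s):
    """How many times over the drive covers the required data rate."""
    return drive_mb_s / required_mb_s


PEAK_READ_MB_S = 246  # Sony F65 playback, the heaviest read in these tests

# Media drive speeds as stated in the test notes.
drives = {"2013 internal SSD": 1200, "2010 internal RAID5": 450}

for name, speed in drives.items():
    print(f"{name}: {headroom(speed, PEAK_READ_MB_S):.2f}x the peak read rate")
```

    Anything above 1.0x means the media drive should not be the limiting factor for that test, at least on paper.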
    Eric B Johnson likes this.
  14. Jason Myres Moderator

    Message Count:
    2,321
    Location:
    Los Angeles

    Me, too. I think the big question for the D700s will be how capable they are for 4K work.
    Eric B Johnson likes this.
  15. Jason Myres Moderator

    Message Count:
    2,321
    Location:
    Los Angeles
    1200MB/s for the 2013 and 450MB/s for the 2010

    I agree. This is where a couple of PCI-E slots would have really helped.
  16. Jason Myres Moderator

    Message Count:
    2,321
    Location:
    Los Angeles
    Debayer is one example where the 3.5GHz clock on the 2013 6-Core is helping, but what's interesting is that Mavericks isn't. If you take a look at all of the scores the best overall R3D debayer performance is on the 2010 running 10.8.4. (Jan 20th: 12-Core D700 scores added, so this is no longer true.)

    Assuming you need at least Half Good to view R3Ds on a large display without artifacts, the 2010/ 10.8.4 combination is the only one that can maintain 24fps at 4K. Neither machine can play back EPIC or DRAGON at Half Good in real time. If you're interested in a 5 or 6K RED Raw workflow, you're probably looking at a RED Rocket-X along with an 8- or 12-Core 2013 Mac Pro.

    I'll see if we can do something about that.
  17. Jason Myres Moderator

    Message Count:
    2,321
    Location:
    Los Angeles
    CPU-wise, I actually think the 2010 12-Core running 10.8.4 is the winner here. However, GPU-wise the D500s are pretty impressive, and are a much better example of what's possible with Resolve 10 running OpenCL than prior AMD cards were. A lot of that has to do with the guys at BMD. That being said, you can't add more if you need them, and a 2010 Mac Pro running two GTX 780s or even a GTX 690, probably would have won more of the GPU-bound tests.

    The spirit of the test wasn't to declare a winner, but more to compare a configuration we know well with one we don't know much about. I did try to find a couple of Radeon 7970s locally, but it's a two year old card now, and they just aren't readily available. Also, using 7970s (or any AMD GPU) probably would have muddied the water more than clarified it, as fewer people have experience with them. We all have a good understanding, performance-wise, for the 2010 Mac Pro and Nvidia family tree. We need to begin establishing that with the 2013 Mac Pro and AMD, and a good way to do that is a comparison with what we're already familiar with.

    Totally agree. Unless you're up for a science experiment, I would tip-toe in at this point.
    Eric B Johnson likes this.
  18. jake blackstone

    Message Count:
    690
    Location:
    Los Angeles
    My point too wasn't about finding the winner. All I was trying to say, is it would be more informative to compare two GPUs to two GPUs for processing, even if one pair was ATI and the other nVidia. Comparing 2 GPUs to 1 GPU is not as helpful or informative, as this test could have been. For many inexperienced users this important point may escape their calculation for the system choice.
  19. pedoussaut gilles

    Message Count:
    1
    Location:
    toulouse
    Very interesting test, thank you...
    This is exactly what i was waiting for.
    After 5 days of training on Resolve, I want to replace my old Mac Pro, and I am hesitating between the new Mac Pro and a used 5,1 with a GTX 780-type GPU...
    I have read somewhere that the Lite version of Resolve works with only one GPU.
    I suppose the test was done with the full version of Resolve.
    Can you tell us, please?
    Thank you
  20. Joseph Mastantuono

    Message Count:
    218
    Location:
    Brooklyn
    Those numbers look pro enough for most systems applications. Very curious about the Red Debayer speeds on the 12 core model, but overall, this is better performance than I was expecting.

    Thanks for the tests, Jason, super helpful!
    Eric B Johnson likes this.
