Benchmarked: Ashes of the Singularity


The Politics of DX12

Last week, Oxide gave press early access to their pre-beta version of Ashes of the Singularity. If you’ve been hiding under a rock, here’s why this is important. So far, the only DX12 benchmarks anyone has been able to run have been synthetic in nature—the 3DMark API Overhead test pounds the GPU with draw calls until the GPU hits its limit, and gives a score; the earlier Star Swarm benchmark (formerly of AMD Mantle fame) was sort of in the same situation, except Star Swarm was a lot closer to being an actual game. And that game will be Ashes of the Singularity.

Now, the first thing to get out of the way is that Ashes of the Singularity sports an AMD Gaming Evolved logo, meaning they’re actively receiving help and promotion from AMD. This is nothing new, as we’ve had plenty of Nvidia The Way It’s Meant To Be Played (TWIMTBP) titles over the years, including Batman: Arkham Knight (and all the other Arkham games), The Witcher 3, Assassin’s Creed: Unity, Far Cry 4, the Borderlands series, and the Metro series, to name a few. On the AMD side, we have plenty of options as well: Tomb Raider, Civilization: Beyond Earth, Hitman: Absolution and its upcoming sequel, the recent and upcoming Deus Ex titles, Dragon Age: Inquisition, and most of the DiRT series. We list these merely to show that there are many games that are promoted by AMD or Nvidia, but usually not both; you’ll also note that we’re pretty evenly split on the games we’re currently benchmarking for GPU reviews. But the short summary is that titles with an Nvidia logo are often better optimized—particularly near launch—for Nvidia GPUs, and likewise AMD titles are often better optimized for AMD GPUs. Capiche?

This discussion of AMD backing also becomes pertinent when we get to looking at performance. Nvidia contacted the press after the Ashes of the Singularity benchmark went out to point out that anti-aliasing was running sub-optimally on the DX12 path with Nvidia GPUs, and they recommended we test with AA disabled. Developer Oxide responded with a blog post titled The Birth of a New API, saying that the DX11 and DX12 MSAA is “essentially unchanged.” And here’s where things get a bit sticky. Potentially, AMD has some hardware features that would enable a developer writing DX12 code to have better MSAA performance compared to DX11 code; Nvidia GPUs may or may not be able to do the same thing.

Getting even deeper into the fundamentals of DX11 vs. DX12 programming, under DX11 there was a lot the GPU drivers could do to optimize performance. With DX12 being a low-level API, most of those driver tweaks are no longer possible; instead, it’s up to the software developers to write optimized code to extract maximum performance from the various GPUs. In a sense, it’s like giving game developers the ability to write assembly language for the GPU rather than programming in C++, though it should be noted that DX12 is still a much higher-level API than true bare-metal code. Regardless, if a developer is going to extract maximum performance from a GPU, they’ll need to optimize their code for that GPU—and code optimized for one GPU’s architecture may not run optimally on a different architecture! In a worst-case scenario, a developer might need separate code paths for AMD’s Fiji, Hawaii, Tonga, Tahiti, etc., and Nvidia’s Maxwell 2.0, Maxwell 1.0, Kepler, Fermi, etc., architectures.
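To make that concrete, here’s a minimal sketch of how an engine might pick a vendor-specific code path at startup. The DXGI calls and PCI vendor IDs are standard; the RenderPath names and the idea of exactly three paths are our own illustration, not anything from Oxide’s actual code.

```cpp
// Minimal sketch (not Oxide's code): choose a vendor-specific render path at
// startup from the DXGI adapter description. Link with dxgi.lib.
#include <dxgi1_4.h>
#include <wrl/client.h>
#include <cstdio>

using Microsoft::WRL::ComPtr;

enum class RenderPath { GenericDX12, AmdGcn, NvidiaMaxwell };  // illustrative names

RenderPath SelectRenderPath()
{
    ComPtr<IDXGIFactory4> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory))))
        return RenderPath::GenericDX12;

    ComPtr<IDXGIAdapter1> adapter;
    if (FAILED(factory->EnumAdapters1(0, &adapter)))      // primary adapter
        return RenderPath::GenericDX12;

    DXGI_ADAPTER_DESC1 desc = {};
    adapter->GetDesc1(&desc);

    switch (desc.VendorId)                                 // standard PCI vendor IDs
    {
    case 0x1002: return RenderPath::AmdGcn;                // AMD: Fiji, Hawaii, Tonga, Tahiti...
    case 0x10DE: return RenderPath::NvidiaMaxwell;         // Nvidia: Maxwell, Kepler, Fermi...
    default:     return RenderPath::GenericDX12;           // Intel or anything else
    }
}

int main()
{
    std::printf("Selected render path: %d\n", static_cast<int>(SelectRenderPath()));
    return 0;
}
```

In a real engine, of course, the branching would go deeper than the vendor ID—down to individual architectures—which is exactly the extra work DX12 pushes onto developers.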

All of this is further compounded by the fact that Ashes of the Singularity is currently “pre-beta,” though the official beta should be starting very soon. The beta stage is often where a lot of performance optimizations and fine tuning takes place, so looking at performance right now is, at best, a preview of what may or may not come to pass.

We can argue about whether Oxide and Nvidia are being fully transparent, but that’s sort of beside the point. The reality is that DX12 is supposed to be a low-level API that will allow the game developers to extract more performance from the hardware, which means better graphics and hopefully better gameplay will be possible. Or put another way, at the very least, DX12 performance should never be lower than DX11 performance; if it is, something is wrong with the code and the developer should look to fix things. The rumor is that Nvidia put a lot of effort into their DX11 drivers for Ashes, and didn’t do much to help with DX12 optimizations for their hardware, but that’s mostly speculation. What we do know is that DX12 with MSAA enabled does in fact tend to run slower on Nvidia GPUs than the DX11 code, and that’s a clear problem. Ultimately, we opted to run all testing without MSAA enabled; when the game officially launches, we can revisit the subject.

But who cares about all the political stuff going on behind the scenes?! We’re still looking forward to DX12 games and we want to know as much as the next guy what DX12 can do for performance, graphics quality, etc. All the above caveats aside, how does the current pre-beta release of Ashes run on the various GPUs? That’s what we attempted to find out, which entailed running the benchmark many, many times.

Let me tell you, there’s no better way to make someone hate a game than to have them watch the same sequence over and over again! At more than three minutes per run, things add up quickly: we tested no fewer than three resolutions, three quality settings, two CPU clock speeds, four thread settings, and two graphics cards, under both DX11 and DX12. (If you want the math: 3 * 3 * 2 * 4 * 2 * 2 = 288 runs, which works out to 14.4 hours (minimum!) of watching the same three-minute sequence.) Thank goodness for scripting….
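For the curious, the harness doesn’t need to be anything fancy. Here’s a rough sketch of the kind of loop we mean; the executable name and command-line switches are hypothetical placeholders (Ashes’ real parameters aren’t listed here), and the 72 runs it produces cover one GPU at one CPU clock, so the two cards and two clock speeds multiply that out to the 288 total above.

```cpp
// Rough sketch of a benchmark sweep harness. Executable name and switches are
// hypothetical stand-ins; 3 resolutions * 3 qualities * 2 APIs * 4 thread
// counts = 72 runs per GPU per CPU clock speed.
#include <cstdio>
#include <cstdlib>
#include <string>

int main()
{
    const char* resolutions[] = { "1920x1080", "2560x1440", "3840x2160" };
    const char* qualities[]   = { "Low", "Medium", "High" };
    const char* apis[]        = { "DX11", "DX12" };
    const int   threads[]     = { 2, 4, 6, 12 };

    int run = 0;
    for (const char* res : resolutions)
        for (const char* quality : qualities)
            for (const char* api : apis)
                for (int t : threads)
                {
                    ++run;
                    // Hypothetical command line; substitute the game's real switches.
                    std::string cmd = std::string("Ashes_Benchmark.exe")
                        + " -resolution " + res
                        + " -quality "    + quality
                        + " -api "        + api
                        + " -threads "    + std::to_string(t)
                        + " -log run"     + std::to_string(run) + ".csv";
                    std::printf("[%d/72] %s\n", run, cmd.c_str());
                    std::system(cmd.c_str());   // blocks until the three-minute run finishes
                }
    return 0;
}
```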

Team Red: ASUS Strix R9 Fury

Team Green: EVGA GTX 980 Ti ACX 2.0

In the interest of keeping the number of charts to a minimum (inasmuch as sixteen charts is a “minimum”), we’re only showing the Low and High quality presets, again with AA disabled on the High preset. Medium quality, as you’d expect, ends up falling between the two and is thus not really necessary, but if anyone wants those charts as well, let us know. We’ve grouped the charts according to the test GPU, with differing numbers of threads, resolution, and DX11/DX12 on each GPU. At the time of testing, we were somewhat limited in terms of what GPUs we had available, so we tested with an EVGA GTX 980 Ti (factory overclocked) and the Asus Strix R9 Fury. Note that this isn’t an AMD vs. Nvidia performance test, but rather a look at how each vendor scales—or doesn’t scale!—with the various settings/features.

Ashes of the Singularity Heavy
Heavy batch of draw calls incoming!

You Take the High Road…

Starting with the High quality preset, we’re looking at the overall average FPS for the entire benchmark. Oxide actually breaks things up into Normal, Medium, and Heavy batches, as the number of draw calls in the test scenes can vary quite a bit, but if we wanted to report those figures we’d need another 48 charts. And as much as we like charts, that’s overkill, so no thanks. Anyway, the average FPS correlates pretty well with the Medium batch results, and that makes sense: Normal has fewer calls, Heavy has more, and the overall average is close to Medium. We’ll also look at the 97th percentile FPS, which is a good indication of whether a game stutters at times.
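As a side note, here’s roughly how a 97th percentile FPS figure can be derived from a frame-time trace. This is a generic sketch of the technique, not necessarily the exact math Oxide’s benchmark uses: sort the per-frame times, take the frame time that 97 percent of frames beat, and convert that near-worst-case time back into FPS.

```cpp
// Sketch: derive a "97th percentile FPS" figure from per-frame times (ms).
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

double Percentile97Fps(std::vector<double> frameTimesMs)
{
    if (frameTimesMs.empty()) return 0.0;
    std::sort(frameTimesMs.begin(), frameTimesMs.end());   // slowest frames end up last
    size_t idx = static_cast<size_t>(std::ceil(0.97 * frameTimesMs.size())) - 1;
    return 1000.0 / frameTimesMs[idx];                      // ms per frame -> frames per second
}

int main()
{
    // 95 smooth frames at ~60fps (16.7ms) plus five 40ms stutters: the spikes
    // drag the 97th percentile down to 25fps even though the average is ~56fps.
    std::vector<double> trace(95, 16.7);
    trace.insert(trace.end(), 5, 40.0);
    std::printf("97th percentile: %.1f fps\n", Percentile97Fps(trace));
    return 0;
}
```

That’s why the percentile figure is a better stutter indicator than the plain average: a handful of slow frames barely moves the average but shows up immediately in the near-worst-case number.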

Maximum PC 2015 GPU Test Bed
CPU: Intel Core i7-5930K (4.2GHz overclocked / 2.1GHz underclocked)
Mobo: Gigabyte GA-X99-UD4
GPUs: EVGA GeForce GTX 980 Ti ACX 2.0; Asus Strix R9 Fury
SSD: 2x Samsung 850 EVO 250GB
HDD: Seagate Barracuda 3TB 7,200rpm
PSU: EVGA SuperNOVA 1300 G2
Memory: G.Skill Ripjaws 16GB DDR4-2666
Cooler: Cooler Master Nepton 280L
Case: Cooler Master CM Storm Trooper

Our test system is the same as we normally use for GPU tests, except we ran it both overclocked at 4.2GHz and underclocked at 2.1GHz. For the multi-threaded testing, we used a command-line parameter for Ashes rather than actually disabling/enabling cores in the motherboard BIOS; unfortunately, that appears to have only partially worked, as the two-thread and four-thread results don’t change much. Given the preliminary nature of the testing, we’ll go with what we have for now, but we would most likely see better scaling if we had physically turned off two cores and disabled Hyper-Threading rather than just telling Ashes to run with four threads.

Chart: AMD R9 Fury, 4.2GHz CPU, High preset, average FPS
Chart: AMD R9 Fury, 2.1GHz CPU, High preset, average FPS
Chart: AMD R9 Fury, 4.2GHz CPU, High preset, 97th percentile FPS
Chart: AMD R9 Fury, 2.1GHz CPU, High preset, 97th percentile FPS

Starting with the AMD results, right away we find some interesting stuff going on. Using DX12 with a 4.2GHz processor is enough to basically max out the R9 Fury. It doesn’t matter if we use two, four, six, or 12 threads: performance is nearly identical. Alternatively, having six or 12 threads with a 2.1GHz processor also delivers nearly the same level of performance. Here’s where DX12 is going to do AMD a ton of favors, at least in the CPU/APU arena, as it looks like four physical cores at a moderate clock (3GHz) should make the GPU the primary bottleneck (though multi-GPU configurations might still want more CPU power). Performance on the 2.1GHz processor improves by over 25 percent going from two to 12 threads (granted, no one is likely to be running a 12-thread 2.1GHz CPU). Perhaps more telling is that at 2.1GHz, DX12 is able to improve AMD’s performance by 35–65 percent over DX11; even at 4.2GHz, DX12 still boasts a 15–35 percent improvement.

Of course, part of the reason for the above improvements is the poor DX11 results. We’ve heard Nvidia put a lot of effort into their DX11 performance, but by contrast it looks like AMD has made virtually no effort to deliver good DX11 performance with Ashes. Threads don’t matter much under DX11 either: there’s little difference in performance regardless of the thread count, even at 2.1GHz. The 4.2GHz processor shows at most a five percent increase going from two threads to 12 threads, while the 2.1GHz processor shows at most a 10 percent improvement. The change in clock speeds does help, of course: the 4.2GHz CPU is up to 30 percent faster than the 2.1GHz CPU, though at higher resolutions the margin of victory narrows.

Chart: Nvidia GTX 980 Ti, 4.2GHz CPU, High preset, average FPS
Chart: Nvidia GTX 980 Ti, 2.1GHz CPU, High preset, average FPS
Chart: Nvidia GTX 980 Ti, 4.2GHz CPU, High preset, 97th percentile FPS
Chart: Nvidia GTX 980 Ti, 2.1GHz CPU, High preset, 97th percentile FPS

Flipping over to the Nvidia side of things, it’s a completely different story. DX12 helps performance… sometimes; other times, it’s worse than DX11. This is without MSAA, which apparently further exacerbates the situation. Remember what we said earlier about software optimizations vs. driver optimizations? It looks like Nvidia’s 980 Ti is currently running DX12 code tuned for AMD hardware, which in many instances is unable to match Nvidia’s highly tuned DX11 driver performance. We might even go so far as to say that Nvidia set the DX11 bar really high, and Oxide failed to clear it—at least right now.

Digging into the details, what’s interesting is that unlike AMD, Nvidia shows clear performance scaling with more threads with the lower clocked CPU. DX11 performance improves by 20–30 percent (depending on resolution) going from two to 12 threads, and DX12 performance improves by up to 40 percent. However, 4K performance is actually lower under DX12 than under DX11. Crank up the CPU clocks to 4.2GHz and threads become less of a factor; at best we see a 10 percent increase at 1080p under DX11, but at higher resolutions the 980 Ti becomes the bottleneck.

If you’re wondering why Nvidia may have a bone to pick with Oxide, at 4.2GHz their DX11 mode outperforms DX12 mode across all resolutions and thread counts. Oops. Again, since DX12 is a low-level API, it’s up to the software developers to optimize their code for different hardware. Oxide notes in their blog post, “Some optimizations that the drivers are doing in DX11 just aren’t working in DX12 yet. Oxide believes it has identified some of the issues with MSAA and is working to implement workarounds on our code.” In other words, Nvidia’s optimized DX11 drivers are doing a better job at certain things right now than Oxide’s DX12 code—but Oxide is working to fix that.

We haven’t said much about the 97th percentile results yet, but the story there is much the same. Nvidia with the 4.2GHz CPU delivers similar minimum FPS regardless of DX11/DX12 or the number of CPU threads. At 2.1GHz, however, DX12 does make a sometimes sizable difference—the 1080p results with 12 threads are nearly 50 percent higher than the DX11 results. For AMD, DX11 minimums are horrific: well under 20fps. There’s no scaling with CPU threads on DX11, but DX12 in turn delivers a great showing: the 12-thread 1080p DX12 performance is up to 2.5X higher than the DX11 performance on a 2.1GHz CPU. Having a 4.2GHz CPU helps some, but DX12 still shows nearly a doubling of minimum FPS at 1080p, a 75 percent boost at 1440p, and a still-hefty 50 percent increase at 4K.

So far, we’ve avoided making direct comparisons between the two GPUs, as they’re not in the same price bracket. However, if we take it as a given that the EVGA GTX 980 Ti is roughly 20 percent faster than the Asus R9 Fury (that’s what our earlier testing showed), it looks like AMD’s Fury X may hold a slight performance advantage over the 980 Ti in DX12 mode. But we need to balance that against how badly AMD does in DX11. The R9 Fury in DX11 mode is pretty clearly running into CPU bottlenecks, even at 4.2GHz, and those bottlenecks kick in at far lower frame rates than on Nvidia’s hardware. Ergo, AMD’s DX11 drivers are not nearly as efficient as Nvidia’s DX11 drivers—something many people have noticed over the past several generations of hardware.

Ashes of the Singularity Medium
Keep those draw calls in moderation, soldier!

I’ll Take the Low Road

That takes care of performance at the High settings, but what if we drop the quality? We’ll skip over most of the analysis, as the story doesn’t change too much from the above—and most people owning a 980 Ti or R9 Fury aren’t going to be running low-quality settings in the first place! Here’s a repeat of the above charts, only now we’ve dropped the rendering quality.

Chart: AMD R9 Fury, 4.2GHz CPU, Low preset, average FPS
Chart: AMD R9 Fury, 2.1GHz CPU, Low preset, average FPS
Chart: AMD R9 Fury, 4.2GHz CPU, Low preset, 97th percentile FPS
Chart: AMD R9 Fury, 2.1GHz CPU, Low preset, 97th percentile FPS
Chart: Nvidia GTX 980 Ti, 4.2GHz CPU, Low preset, average FPS
Chart: Nvidia GTX 980 Ti, 2.1GHz CPU, Low preset, average FPS
Chart: Nvidia GTX 980 Ti, 4.2GHz CPU, Low preset, 97th percentile FPS
Chart: Nvidia GTX 980 Ti, 2.1GHz CPU, Low preset, 97th percentile FPS

Not surprisingly, the reduction in graphics fidelity has made Ashes more CPU bottlenecked. The biggest change is that frame rates are higher, naturally, but even at low quality we still see a decent amount of scaling on AMD hardware going from DX11 to DX12. In fact, the improvement is even greater this time, with up to 90 percent improvements at 2.1GHz and 60 percent at 4.2GHz. Nvidia also shows better performance across all settings with the 2.1GHz CPU, but 4K with the 4.2GHz processor still shows a performance drop of up to 10 percent.

As for the 97th percentile frame rates, again we have to look at Nvidia and AMD separately. For Nvidia, there appears to be a wall at around 30fps at 2.1GHz in DX11 mode, and DX12 helps to lift that bottleneck to more than 50fps. With the 4.2GHz CPU, the wall is at 45fps, and DX12 increases that to nearly 70fps. Interestingly, AMD shows similar results under DX12: 50fps at 2.1GHz and 60fps at 4.2GHz. But that darn DX11 performance: 16fps at 2.1GHz and 23fps at 4.2GHz is horrible; there’s no other way to put it.

Ashes of the Singularity Normal
It's quiet... too quiet!

Ashes to Ashes, Dust to Dust

As the first of what will likely be many DX12-enabled titles coming between now and sometime in 2016, Ashes of the Singularity is at best a taste of what’s to come. And that taste is… perplexing. Anyone hoping DX12 will mean the end of the GPU vendor wars is sure to be disappointed; if anything, DX12 looks to make the rivalry even more brutal. We’ve seen a few people hailing AMD as the decisive winner of DX12 performance, the problem being that we’re looking at an AMD-promoted title; AMD should offer better performance than Nvidia on a title it’s promoting, especially at a pre-launch stage. Whether this will reflect future DX12 titles remains to be seen. Unreal Engine, Unity, Frostbite, and a host of other engines will more likely than not behave differently from Ashes.

Frankly, this testing is really just the tip of the proverbial iceberg. We ran two GPUs at a whole bunch of settings to find out how they performed, and this is in a single game. Does Fury X claim the crown from 980 Ti in this one title? We could answer that question with some additional testing, but that's sort of missing the point. Right now, we can see that DX12 definitely makes a difference in performance, giving the game developers a lot more power. But with great power comes great responsibility, and some developers may not be able to handle DX12, at least not without more time and effort.

The next fight is shaping up to be Lionhead’s Fable Legends, and that will perhaps be a more neutral battleground, as it’s neither an AMD nor an Nvidia title. In fact, it appears Microsoft (which owns Lionhead) is determined to put forth a message that DX12 is unified. Microsoft doesn’t want DX12 to look like a fractured landscape where AMD or Nvidia rules and processor graphics gets left in the dust. In that sense, Fable should be the closest thing to a vendor-agnostic take on DX12 we’re going to see in the near term. We’re certainly looking forward to testing it, though it may be a few months.

Ultimately, no matter what AMD, Microsoft, or Nvidia might say, there’s another important fact to consider: DX11 (and DX10/DX9) isn’t going away. The big developers have the resources to do low-level programming with DX12 to improve performance, but independent developers and smaller outfits are not going to be as enamored with putting in more work on the engine if it just takes time away from making a great game. And at the end of the day, that’s what really matters. Games like StarCraft II, Fallout 3, and the Mass Effect series have all received rave reviews, with nary a line of DX11 code in sight. And until DX11 is well and truly put to rest (maybe around the time Dream Machine 2020 rolls out?), things like drivers and CPU performance are still going to be important.

Let's end with some questions. What games are you most looking forward to for the coming year? And will DX12 support—or a lack thereof—affect your buying decisions? Let us know what other games you’re most interested in seeing benchmarked!

Follow Jarred on Twitter.



From Maximum PC: http://bit.ly/1UjdR3d
