Hi All,
I've again spent a weekend investigating GC3 performance on my PC. After OCing my CPU, analyzing page faults, and OCing my video card, I really felt that I had hit a brick wall. None of my resources ever seemed to be fully utilized, which raised the question of what "I" could do to make the game go faster. Windows Resource Monitor and a free tool from MS called Windows Performance Analyzer (WPA) answered that question for me.
NOTE - analysis is from an insane map game, 100 factions, turn 500 or so. WPA capture was performed during a soak run, turn time about 2 minutes. Save game linked below.
Below is a screenshot of what many of you have seen, and it probably leaves you a bit frustrated that your fancy multi-core CPU isn't being fully utilized. It's especially frustrating to watch this graph because you have nothing else to do during the few minutes a turn can take in the mid/late game. (Note: I do overclock, but not in the BIOS, only in Windows once I've launched the game, hence Windows doesn't display it properly.)
So, using WPA, here are the gory details. At first glance, it again looks like all is well, and for some unknown reason, the CPU is "throttling" or could be doing more.
However, when we look at the per-thread performance, some interesting things show up.
One thread is capped at or near 12.5% CPU, which on my 8-core CPU means it is running at 100% "core" utilization (this is in fact the graphics handling thread, discussed below). Two other threads are quite near the 12.5% cap. None of the other threads seem to be doing much at all. Sure, we see some nice spikes from a few other threads, but they aren't used often enough, as several seconds is an eternity for a CPU.
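The 12.5% figure follows from how per-thread CPU time is reported as a share of all logical cores: on an 8-core machine, one fully busy thread can use at most 100/8 = 12.5% of the total. A minimal Python sketch of that conversion (the function names are made up for illustration, and the 8-core count is the machine described in this post):

```python
def single_thread_cap(logical_cores):
    """Highest share of TOTAL CPU (in percent) that one thread can ever show."""
    return 100.0 / logical_cores


def per_core_utilization(total_cpu_percent, logical_cores):
    """Convert a thread's share of total CPU into utilization of the one core it runs on."""
    return total_cpu_percent * logical_cores


if __name__ == "__main__":
    cores = 8  # assumption: the 8-core CPU used for the capture
    print("Cap for one thread:", single_thread_cap(cores), "% of total CPU")
    print("12.5% of total is", per_core_utilization(12.5, cores), "% of one core")
```

On a 4-core CPU the same fully busy thread would show up as 25% of total CPU, so the cap to look for depends on the core count.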
When looking at statistics of the threads throughout the 2 minute capture period, we can much better see that the 3 GC3 threads are by far doing the vast majority of processing.
What does this mean for you? Well, I wouldn't rush off to buy a 6+ core machine, as it's not going to help you very much. I've done a lot of testing, and with all other things being equal, I am seeing identical net GC3 performance whether I have all 8 cores enabled or only 4. The game therefore appears to be best suited to 4-core CPUs: 3 cores for the big GC3 threads, and 1 core to run Windows and the puny GC3 threads (this is a generalization of core use; setting affinity won't buy you anything, I've tried it). Was this done on purpose, making 4 cores the sweet spot? I have no idea. Distributing the processing of the 3 major threads further (what if they were split into 6, albeit this is far from trivial) would have a slight impact for those running 4 cores, but would be a major improvement for those running 8 cores. It certainly needs to be done soon, however, as Intel has caught up with or surpassed AMD's core counts, and the future is, well, more and more cores. AMD vs Intel debate aside, higher core-count processors tend to go hand in hand with higher clock speeds, so if you are considering a new CPU and you currently have 4 cores, only the improved clock speed will really help you out as of today.
Graphics Efficiency Improvements Needed:
This is by far the most inefficient software thread in GC3. It's constantly using max resources whenever you are on the map. I don't know exactly what it is doing, but I would fully expect it not to be doing much when pointed at a blank space on the map. Here are some of my GPU and graphics thread observations in various scenarios:
Scenario                                        GPU Util.   Graphics-Thread Core Util.
Staring at a blank uncharted part of the map    60%         100%
Looking at the global map from far away         92%         100%
Space combat battle viewer                      15%         10%
Post battle results screen                      30%         10%
Intro video                                     8%          n/a
Addressing this, however, will only help people who have few cores, fewer than 4. More interestingly, I've seen high GPU and graphics thread utilization on tiny maps as well, which leads me to believe something VERY inefficient is going on and that this is not a 'matter of scale' for large maps. Even if it is optimized, the other two GC3 threads will still be capped by the core limit (I've tried: you can go to a planet or shipyard during your turn, drastically reducing the graphics thread's util., but the turn doesn't run faster). Nevertheless, it's certainly unfair to those who have below-average graphics cards, which is what the SD hardware recommendations suggest should be enough (it is a TURN based game after all). As well, when a 4th high-utilization GC3 thread is eventually developed (a great thing for peeps who have 8 cores), the GC3 graphics thread will contend with Windows processes and 4-core users will suffer.
I can't speak to exactly what the AI is doing during the turn, but obviously more needs to be done. It's hard to believe that in between turns, the only thing happening is ship combat/moves. If true, it must be running at the speed of watching them fight/move on the map, which is obviously NOT needed if your current map vantage point doesn't show the enemy ships. The same goes for your own ships. If you can't see them, the game needs to do this processing much faster than the time it takes to watch a ship travel 10 hexes. I have no evidence that this is actually occurring, but I suspect it because of the long turn times in the late game, where the AI has a large number of ships (one test could be to use the debug console to destroy everyone's ships and see how long the next turn takes).
Again, I am a huge fan and major believer in the potential of this game, and by no means discount SD's efforts thus far. I simply wanted to share my observations and thoughts on current limitations and future growth.
Game save that was used for WPA analysis: https://www.dropbox.com/s/c7y4j9ixsxvkz6i/Previous%20Auto-Save%20-Original%20turn%20503%20crash.GC3Sav?dl=0
To make similar observations yourselves, go to Control Panel, Administrative Tools, Performance Monitor, and add the "% Processor Time" counters from the Thread object. Much more detailed analysis can be done with the Windows Performance Toolkit, available for free at MS's website. It has a recorder and an analyzer. Watch out, the recorder will create HUGE files: my 2 minute capture was 7GB in size (binary), which is why it is NOT on Dropbox!
Cheers,
Dan
So, I think I have peaked my FX8320. I'm running 'stable' at 5.008GHz. My voltage is a bit shocking, needing 1.625v (up to 1.68v during full 8 core load tests). My temp is at my personal limit of 65C socket. The odd thing is that I fail Cinebench and AIDA64 FPU tests miserably, but if I don't include FPU based instructions, I can run stable load tests all day on Prime95 or AIDA64. Apparently, the FX8320 and others in the line have a 'weak as shit FPU'. They need insane amounts of voltage for FPU based instructions (AVX). I can only assume that due to better manufacturing bins, chips like the 8350, 8370, and 9xxx series will perform with slightly less voltage, allowing them to overclock even higher while still using insane voltage levels. That last 128MHz, from 4.88 to 5.008, required going from 1.53v to 1.625v. Going from 4.7GHz to 4.88 required going from 1.39v to 1.53v! So honestly, I probably won't leave it at 5GHz unless I can get some speed stepping to work well while OCed without affecting peak perf. At least with AMD, I've got nearly all the BIOS controls available from the OS, so I won't have to reboot when I want to make a change.
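To put rough numbers on those voltage steps, here is a minimal Python sketch using only the clock/voltage pairs quoted above. The power figure relies on the common rule of thumb that dynamic CPU power scales with frequency times voltage squared; that is an assumption brought in for illustration, it ignores leakage, and it is not a measurement from this machine. The function names are hypothetical.

```python
# (clock in GHz, core voltage in V) pairs as quoted in the post
OC_POINTS = [(4.70, 1.39), (4.88, 1.53), (5.008, 1.625)]


def step_costs(points):
    """For each consecutive pair, return (extra MHz, extra mV, mV per MHz)."""
    costs = []
    for (f0, v0), (f1, v1) in zip(points, points[1:]):
        delta_mhz = (f1 - f0) * 1000
        delta_mv = (v1 - v0) * 1000
        costs.append((delta_mhz, delta_mv, delta_mv / delta_mhz))
    return costs


def relative_dynamic_power(freq_ghz, volts, base_freq_ghz, base_volts):
    """Rule-of-thumb estimate: dynamic power ~ f * V^2 (assumption, ignores leakage)."""
    return (freq_ghz * volts ** 2) / (base_freq_ghz * base_volts ** 2)


if __name__ == "__main__":
    for delta_mhz, delta_mv, ratio in step_costs(OC_POINTS):
        print(f"+{delta_mhz:.0f} MHz cost +{delta_mv:.0f} mV ({ratio:.2f} mV/MHz)")
    base_f, base_v = OC_POINTS[0]
    for f, v in OC_POINTS[1:]:
        rel = relative_dynamic_power(f, v, base_f, base_v)
        print(f"{f} GHz @ {v} V: about {rel:.2f}x the estimated dynamic power of 4.7 GHz")
```

Because the voltage enters squared, the estimated power grows much faster than the clock speed does, which fits the temperature and degradation concerns raised in this thread.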
Interestingly, this OC journey has led me to discover a problem with 'starting GC3 with a huge number of factions' while overclocked. More on that in another thread.
So, here are my performance results, using the save game from the original post. Note, to do an apples to apples comparison, once loaded, I immediately type soak into the debug console. I then go to research and just hit DONE, hit the space bar on all idle ships, and shut down the 3 idle shipyards. I then wait 3 minutes to make sure the AI has finished whatever it might have been doing just after load (hence the first turn will be much shorter than the others). At this point, all I have to do is hit TURN, and sit back and watch the clock:
Turn Time (m:ss)
495 1:18
496 2:05
497 2:03
498 2:02
499 1:37
500 1:41
501 1:40
502 1:43
503 CTD at 11s into turn
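For anyone who wants to compare their own runs against these numbers, here is a small Python sketch that turns the m:ss values into seconds and averages them. The turn data is copied from the table above; the helper names are made up for illustration, and turn 503 is left out because it crashed before finishing.

```python
# Completed turns from the table above: turn number -> "m:ss"
TURN_TIMES = {
    495: "1:18", 496: "2:05", 497: "2:03", 498: "2:02",
    499: "1:37", 500: "1:41", 501: "1:40", 502: "1:43",
}


def to_seconds(m_ss):
    """Convert an 'm:ss' string into a whole number of seconds."""
    minutes, seconds = m_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def mean_seconds(times, skip_first=False):
    """Average turn time in seconds; optionally drop the first (post-load) turn."""
    turns = sorted(times)
    if skip_first:
        turns = turns[1:]
    return sum(to_seconds(times[t]) for t in turns) / len(turns)


if __name__ == "__main__":
    print(f"Mean of all turns: {mean_seconds(TURN_TIMES):.1f} s")
    print(f"Mean without the first turn: {mean_seconds(TURN_TIMES, skip_first=True):.1f} s")
```

Skipping the first turn is optional; it is offered here because the post notes that the first turn after loading runs shorter than the others.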
About the only thing my 8 cores are actually good for now is cooling. Because my load at 5GHz is now hardly over 30%, the CPU socket temp is barely getting to 50C with my h100i GT. Yeah, I like the water cooler a lot, but its software is absolute crap. It's quite basic though, so it doesn't have to be perfect. The radiator surprisingly never gets warm, and if I disable the radiator fans completely, the socket temp only gets 5-6C worse. So, I basically just keep the fans on minimal speed unless I crank them up in a game.
However, BEWARE of installation issues. It was a royal bitch to get my WC installed. There was not enough room between the mobo and the top of the case. In fact, I had to install my WC fans on top of the case, leaving the radiator alone inside the case. I've read complaints from peeps with Corsair cases not being able to fit the Corsair WCs, even though Corsair said they would fit!! But except for installation issues, it's WELL worth it. I can't make a direct comparison, but I'd estimate that it has dropped temps by 20C under 'normal use', and it has raised my OC limit, giving me up to 500MHz more speed (albeit with voltages that will degrade the CPU faster; by how much, no one can say exactly). I would say that next to an SSD upgrade, this is the best thing you can get to upgrade PC performance.
h100i and 4790k here. I don't overclock because I don't find it useful. The h100i is good, my core stays a solid 25C, although I'm usually not all that intensive with my clock cycles. You might hear some complaints about the "noise" - honestly I never hear a peep, the fans barely ever spool up - the case is positive pressure though, so I'm sure there's quite a bit of passive cooling as air blows past the radiator even with the fans barely spinning.
As Dan noted though, I could definitely see size being an issue. I don't know if it really gets that much more cooling power than the h80i, which is probably easier to fit in smaller cases. I have a huge case though, so fitting it was easy-peasy. The one "fitting" issue that some people do run into though, is that often the RAM slots are close enough to the proc that it can't all fit. So, if you're thinking about getting one, it's one of the variables in your setup that you'll want to check and see that someone else online got it to all fit together. In this case, a lot of the time it's the heat sink riser things on the RAM modules that can make the difference - since everyone wants to make their RAM look like a cyberpunked swordfish it seems like..
Back on topic though, it's an interesting read Dan. I've been way too busy to play much (any?) GC3 since it was released - halfway waiting for it to be finished out via expansions/dlc, halfway waiting for my own plate to get lighter. I lurk the forums though and just wanted to say that I liked this thread quite a bit and appreciate the information that you shared. Turn times on insane games are one of the 'glitches' that I've been keeping an eye on.
cheers,
So, after running lots of PassMark PerformanceTest benchmarks, I see that OCing is well worth it. However, on the AMD (and on the Intel as well, I would imagine), that last 100-200MHz just doesn't seem to be worth it due to fan noise, power consumption, and increased CPU degradation. I only need 2-3 years out of my CPU, but there simply isn't enough data to know what temps and voltages will make it last 6 months, 3 years, or forever....
I still have my old TRS-80 III (the only old PC I have bothered to keep) and it boots up like a champ....at 1.774MHz, it sure coulda used OCing.
I highly doubt that unless you are only reporting idle temps. I could see a ΔT of 25C for the h100i, but unless your room is kept cold enough to hang meat in, it's more likely that the software you're using to monitor your temps is reading the sensor incorrectly.
This is just a personal opinion of mine but I don't feel that any of the general AIO coolers are worth the money when just a little bit more gets you a Swiftech H220X that will outperform every comparably sized AIO while making far less noise than any of them. It's also user expandable and fully warrantied that way if you decide to move farther into the water cooling realm by adding a GPU block and an additional radiator to handle the increased heat load.
In my experience turn times bloat up massively as AIs increase the number of ships they have in play. Why we need to wait for every ship movement to play its full animation even when the ships are off screen or in the fog of war, I'll never know. Thus the more AIs you have, and by extension the more AIs at war, the longer your turn times. It's particularly noticeable when you are at peace and just trying to speed through some basic empire management tasks but have to wait a while to get to your next turn. I've stopped playing these truly massive scale games in the meantime because it's just so boring to sit around waiting on the turn timer.
The last little bit isn't really ever worth it in my experience because by that point you are well into diminishing returns in terms of potential performance gain at the cost of everything else. Generally speaking in terms of overclocked hardware you will notice a gradual failure due to electromigration. You'll have to start lowering clock speeds little by little over time until it degrades to the point where it can't hold any OC at all, no matter what kind of voltages you try to push into it. The farther you push it the faster this process happens. Typically a user will have replaced the hardware with something newer by then though because the newer hardware brings much bigger performance gains or new features that are worth the expense.
Those are some interesting numbers, Dan. Mine might not be exactly apples to apples because I'm not familiar with doing a soak test. I just loaded, hit end turn and clicked on whatever popped up.
Your system seems to be able to get through an extra turn. Mine crashed at turn 502 while yours went to 503 before crashing.
I assume that I'd need to make a shortcut and add "cheat" to the end in order to get to the console, bring the console up and enter "soak". I assume that I won't need to click on the end turn button after that?
I might try it some time and repost the results when I get to it.
Can you confirm what I'd need to do for this test?
<edit> I started with a TRS-80 as well (they called it the TRS-80 Color Computer), but returned it for a Sinclair Spectrum 48k about a week later. 3.5 MHz Z80 processor. I still have it sans power supply. </edit>