Hi All,
I've again spent a weekend just investigating GC3 performance on my PC. From OCing my cpu, analyzing page faults, and OCing my video card, I really felt that I hit a brick wall. None of my resources seemed to ever be fully utilized, so that begged the question of what "I" could do to make the game go faster. Using Windows Resource Monitor and a free tool from MS called Windows Performance Analyzer answered the questions for me.
NOTE - analysis is from an insane map game, 100 factions, turn 500 or so. WPA capture was performed during a soak run, turn time about 2 minutes. Save game linked below.
Below is a screenshot of what many of you have seen, and it has probably left you a bit frustrated that your fancy multi-core CPU isn't being fully utilized. It's especially frustrating to watch this graph because you have nothing else to do during the few minutes a turn can take in mid/late game. (Note: I do overclock, but not in the BIOS, only in Windows once I've launched the game, hence Windows doesn't display it properly.)
So, using WPA, here are the gory details. At first glance, it again looks like all is well, and for some unknown reason, the CPU is "throttling" or could be doing more.
However, when we look at the per thread performance, we see some interesting observations.
One thread is capped at or near 12.5% CPU, which on my 8-core CPU means it is running at 100% "core" utilization (this is in fact the graphics handling thread, discussed below). Two other threads are quite near the 12.5% cap. All of the other threads don't seem to be doing much at all. Sure, we see some nice spikes from a few other threads, but they aren't used often enough, as several seconds is an eternity for a CPU.
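For anyone wondering where the 12.5% figure comes from, here is a minimal sketch of the arithmetic in Python. The function names are made up for illustration; the only assumption is that the tool reports each thread as a share of the whole CPU, so on 8 cores a single thread can never exceed 100/8 percent.

```python
def per_thread_cap(logical_cores: int) -> float:
    """Highest share of total CPU that one thread can show when pinned to one core."""
    return 100.0 / logical_cores


def core_utilization(total_cpu_percent: float, logical_cores: int) -> float:
    """Convert a thread's share of total CPU into its share of a single core."""
    return total_cpu_percent * logical_cores


# On an 8-core CPU a thread shown at 12.5% is saturating one core.
print(per_thread_cap(8))          # 12.5
print(core_utilization(12.5, 8))  # 100.0
```

On a 4-core CPU the same saturated thread would show up as 25% instead.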
When looking at statistics of the threads throughout the 2 minute capture period, we can much better see that the 3 GC3 threads are by far doing the vast majority of processing.
What does this mean for you? Well, I wouldn't rush off to buy a 6+ core machine, as it's not going to help you very much. I've done a lot of testing, and with all other things being equal, I am seeing identical net GC3 performance whether I have all 8 cores enabled or only 4. Therefore, the game appears to be best suited to 4-core CPUs: 3 cores for GC3, and 1 core to run Windows and the puny GC3 threads (this is a generalization of core use; setting affinity won't buy you anything, I've tried it). Was this done on purpose, so that 4 cores is the best? I have no idea. Distributing the processing of the 3 major threads further (what if they were split into 6? Albeit this is far from trivial) would have a slight impact for those running 4 cores, but would be a major improvement for those running 8 cores. It certainly needs to be done soon, however, as Intel has caught up with or surpassed AMD's core counts, and the future is, well, more and more cores. AMD vs Intel debate aside, processors with more cores go hand in hand with higher clock speeds, so if you are considering a new CPU and you currently have 4 cores, only the improved clock speed will really help you out as of today.
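To illustrate the reasoning, here is a rough Python sketch of an idealized lower bound on turn time. The per-thread work figures are hypothetical numbers picked to resemble a roughly 2 minute turn, not measurements, and the model assumes each thread is purely serial and that splitting work is perfect and free, which real code never manages.

```python
def turn_time_lower_bound(thread_work, cores):
    """Idealized lower bound on wall-clock turn time, in seconds.

    thread_work: CPU seconds each serial thread needs (hypothetical values).
    cores: cores left over for the game.
    A turn cannot finish before its longest thread does, nor faster than
    the total work spread evenly over the available cores.
    """
    longest_thread = max(thread_work)
    evenly_spread = sum(thread_work) / cores
    return max(longest_thread, evenly_spread)


three_threads = [120, 110, 110]        # hypothetical: 3 heavy threads
six_threads = [60, 60, 55, 55, 55, 55]  # same work, split in half (assumed free)

for cores in (3, 7):  # 4-core and 8-core CPUs, one core left for Windows
    print(cores,
          turn_time_lower_bound(three_threads, cores),
          turn_time_lower_bound(six_threads, cores))
```

Under these assumptions, the 3-thread layout is stuck at 120 seconds whether 3 or 7 cores are free, while the 6-thread layout gains a little on 3 cores and roughly halves the bound on 7.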
Graphics Efficiency Improvements Needed:
This is by far the most inefficient software thread in GC3. It's constantly using max resources whenever you are on the map. I don't know exactly what it is doing, but I would fully expect it not to be doing much when pointed at a blank space in the map. Here are some of my GPU and graphics-thread observations in various scenarios:
Scenario                                        GPU Util.   Graphics-Thread Core Util.
Staring at a blank uncharted part of the map    60%         100%
Looking at the global map from far away         92%         100%
Space combat battle viewer                      15%         10%
Post battle results screen                      30%         10%
Intro video                                     8%          n/a
Addressing this, however, will only help people who have few cores (below 4). More interestingly, I've seen high GPU and graphics-thread utilization on tiny maps as well, which leads me to believe something VERY inefficient is going on and that this is not a 'matter of scale' for large maps. Even if the graphics thread is optimized, the other two GC3 threads will still be capped by the core limit (I've tried: you can go to a planet or shipyard screen during your turn, drastically reducing the graphics thread's utilization, but the turn doesn't run any faster). Nevertheless, it's certainly unfair to those who have below-average graphics cards, which the SD recommendations list as a suggested HW requirement (it is a TURN-based game after all). As well, when a 4th high-utilization GC3 thread is eventually developed (a great thing for peeps who have 8 cores), the GC3 graphics thread will cause contention with Windows processes and 4-core users will suffer.
I can't speak to exactly what the AI is doing during the turn, but obviously more needs to be done. It's hard to believe that in between turns the only thing happening is ship combat/moves. If true, it must be running at the speed of watching them fight/move on the map, which obviously is NOT needed if your current map vantage point doesn't show the enemy ships. The same goes for your own ships: if you can't see them, the game should do this processing much faster than the time it takes to watch a ship travel 10 hexes. I have no evidence that this is actually occurring, but I suspect it because of the long turn times in late game, when the AI has a large number of ships (one test could be to use the debug console to destroy everyone's ships and see how long the next turn takes).
Again, I am a huge fan and major believer in the potential of this game, and by no means discount SD's efforts thus far. I simply wanted to share my observations and thoughts on current limitations and future growth.
Game save that was used for WPA analysis: https://www.dropbox.com/s/c7y4j9ixsxvkz6i/Previous%20Auto-Save%20-Original%20turn%20503%20crash.GC3Sav?dl=0
To make similar observations yourselves, go to Control Panel, Administrative Tools, Performance Monitor, and add the "% Processor Time" counters under the Thread object. Much more detailed analysis can be done with the Windows Performance Toolkit, available for free on MS's website. It has a recorder and an analyzer. Watch out, the recorder will create HUGE files: my 2 minute capture was 7GB in size (binary), which is why it is NOT on Dropbox!
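If you would rather script it than click through Performance Monitor, here is a rough Python sketch of the same idea: take two snapshots of each thread's cumulative CPU time and turn the difference into percent of one core. This is not what I used for the numbers above. The snapshot helper assumes the third-party psutil package is installed, and the process ID you pass in is whatever Task Manager shows for the game.

```python
def thread_core_percent(before, after, interval_seconds):
    """Percent of one core used by each thread between two snapshots.

    before / after: dicts mapping thread id -> cumulative CPU seconds.
    Threads missing from the first snapshot are skipped.
    """
    usage = {}
    for thread_id, cpu_after in after.items():
        if thread_id in before:
            delta = cpu_after - before[thread_id]
            usage[thread_id] = 100.0 * delta / interval_seconds
    return usage


def snapshot(pid):
    """Cumulative CPU seconds per thread of a process (needs psutil)."""
    import psutil  # third-party package, assumed to be installed
    return {t.id: t.user_time + t.system_time
            for t in psutil.Process(pid).threads()}


# Made-up sample data: thread 1 burned 2 CPU seconds in a 2 second window.
print(thread_core_percent({1: 10.0, 2: 5.0}, {1: 12.0, 2: 5.25}, 2.0))
```

In real use you would call snapshot(pid), sleep for the interval, call it again, and feed both results to thread_core_percent. A thread sitting near 100 here is the same thing as one sitting near 12.5% of total CPU on an 8-core machine.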
Cheers,
Dan
As I put in your +1 Karma note - Outstanding info!
I might be missing it, but what graphics card(s) are you using?
Radeon HD7770. Considered a midrange card when I bought it 18 month ago. Nowadays, not so much, but I would expect it to handle this game properly.
Thanks! When I get home (and after chores and such), I'll grab the WPA and your savegame and run a turn. My machine is at the opposite end of the spectrum from yours as far as CPU cores go: an i3, so only 2 cores. It'll probably report 4 cores due to hyperthreading. We'll see tonight.
32GB of RAM, so it should be able to run your map. Again, we'll see.
My graphics card is a 2GB EVGA GTX 760 SC. A fairly close match to yours as far as speed.
Time to dust off my stopwatch app.
Be aware that this save will absolutely crash on turn 503 (I've used this save for LOTS of benchmarking). I'm gonna open a ticket shortly, so please let me know if you have the same crash as well. Thx.
If his core I3 runs that turn faster than our FX 8cores do Dan I will scream!
With 32GB of RAM, he just might. With that much RAM, I'd create a ramdisk and disable VM, and get the HD completely out of the loop.
Actually, I tried that when I first upgraded the memory. There was no discernible difference in speed between the RAMDisk and the SSD. I'm sure the RAMDisk was faster in absolute terms, but I didn't notice a difference in the real world. I did a backup through Steam to my HDD and that took quite a while. Transferring from SSD to RAMDisk was pretty quick, as was going back, but what if I had a power glitch? I'd have to restore from HDD and lose whatever turns I didn't have on the backup. The RAMDisk just isn't worth it to me.
I'll be home in an hour or so, then I need to take care of a few things and will start the test. Looking forward to it.
I'm not familiar with WPA/WPR, but I did a recording session with the game. I'm not clear on precisely what the numbers mean, but the graphs seem to indicate that both cores and all 4 threads are being utilized pretty well.
I overlaid my CPU-Z report so you can do an apples to apples thing.
I had to interact with the game on some turns, which would add a few seconds to the time. As you said, the game crashed to desktop between turns 502 and 503.
Turn times were as follows:
TURN   Min:Sec      Mouse clicks needed to get through turn
495    1:49         2
496    1:52
497    1:52         1
498    1:40
499    1:21
500    1:29         1
501    1:23         3
502    CTD @1:04
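For anyone who wants to compare their own run against these numbers, here is a small Python sketch that converts the Min:Sec times above into seconds and averages them. Turn 502 is left out because it crashed before finishing, and the few seconds added by mouse clicks are not corrected for.

```python
def to_seconds(stamp):
    """Convert a 'M:SS' string into whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


# Completed turns from the table above (turn number -> Min:Sec).
turn_times = {
    495: "1:49", 496: "1:52", 497: "1:52", 498: "1:40",
    499: "1:21", 500: "1:29", 501: "1:23",
}

seconds = {turn: to_seconds(stamp) for turn, stamp in turn_times.items()}
average = sum(seconds.values()) / len(seconds)
print(f"average completed turn: {average:.1f} s")  # 98.0 s
```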
I shoulda bought Intel - hahah very nice!!! I think your RAM certainly paid off around turn 499. I'm going to be 'upgrading' my PC tomorrow, and I'll run another comparison to see if I can catch up.
Well rats! The saved game will not load on my computer (Razer Blade 2014), it does not max out memory but starts the loading process then fails to desktop.
Yeah, he's doing an insane sized map with 100 factions. I doubt it would fit into 8GB RAM. Is the Razer Blade upgradable? I'd think 12 GB would do it. 16 for sure.
My laptop costs too much to reasonably upgrade. It uses the low power RAM. I'm stuck with the smaller map options and fewer factions on that.
I can confirm. I have run this on a RAM disk as well, and there is no appreciable difference between running it on an SSD and on a RAM disk. The fact that they keep finding boneheaded lengths of time the CPU spends doing things, instead of clean, fast, efficient code, supports the supposition that they pushed the release out way too early. The issue we've found with the CPU is that the game doesn't seem to be using all of the available crunch power of the CPU cores to get all this inefficient checking done rapidly. These turns should be taking 2-5 seconds each instead of 2-5 minutes. Nuff said.
You can set your RAM drive to automatically dump its contents to the SSD daily, which is what I do for safety. Were you doing it manually? Since it's going from RAM to an SSD, it would only take a minute to do manually.
I'd be interested to see your cpu result and patterns of behavior on your intel system versus your FX. What intel cpu did you pick up Dan?
Yes, I was doing it manually, but that wasn't the issue. I didn't want to lose any game time due to a crash where I'd need to restore from HDD. If I move the game from the SSD to the RAMDisk, it moves rather than copies. If I have a power glitch or one of my dogs steps on the power cord, I could lose everything up till the last backup.
Since I couldn't tell the difference in speed, it's not worth it to me to use a RAMDisk.
I wasn't clear: dump a complete copy of the RAM disk's contents while they stay on the RAM disk. I never move anything off the RAM disk unless I need the space. For anybody with a later-model PC, consider getting yourself up to 32GB of RAM. A RAM disk is pretty sweet, and the price of RAM is so damn cheap.
As for your reasons for not playing GCIII from the RAM drive instead of the SSD, I can totally understand, because right now it doesn't offer any benefit. I agree.
That could be, but since memory usage during the load only got as far as 6GB total before it crapped out, I suspect something else is afoul. Oh well.
I'll run some testing on this later to give you another data set. I've got a 4690k, 16GB of RAM, and a GTX970.
E: Well, I was going to run some testing but I can't get it to load. It just throws a CTD about 20 seconds into the loading. Not really sure what the cause is. It's definitely not an out-of-memory crash, since physical memory usage doesn't even reach 8GB (from a baseline of 2GB). Perhaps it's crashing because I don't have all the custom races used? Possibly, I guess.
I shoulda, but I didn't. Yesterday I bought a new motherboard for my fx8320, one much more suitable for overclocking (asus mx97r2 -> asus sabertooth r2). I also picked up a corsair h100i gx water cooler. I'm now at 4.7GHz stable under full load with prime 95, and I can run that GC3 save file at 4.925GHz, as I'm not using my cores simultaneously. I'll post timings later. I think I can squeeze out maybe 100-200MHz more; I'm still getting used to the immense additional controls in this BIOS.
HOLY SHIT FX-8320 at 4.925GHZ Talk about hitting WARP 5 Enterprise, don't blow off the nacelles there Tucker!
Dan, How do you like the h100i?
I don't have the custom races installed either. Are you running W7 or W8? I'm running W7, if that makes any difference.
I'm hoping you can get it to load. I'd like to see how that GTX970 performs.
I have a 970 but it doesn't really seem to be needing the resources.
As far as the core number goes, you should remember that current AMD architecture makes use of modules and "simplified cores".
It means that your 8-core processor is in fact a 4-module processor.
Each module has 2 cores, but some functionality is shared between them: mostly some cache and, more importantly for games I think, the FPU (floating-point unit), which is part of the module, not the cores.
So even though you have 8 cores, you only have 4 FPUs, and as it turns out, we use a lot of floating-point calculations in game development.
p.s. and btw AMD has completely abandoned this kind of architecture for their next product line to come in 2016 where they go back to using fully fledged cores.
That is all fine and dandy, but it means nothing to the task manager or every other software program in existence. Handbrake treats each core like an Intel core, and MS FSX treats each core like its own core. The architecture is pretty good, and the reason they are moving on is a combination of design paradigms: lower power consumption, HBM, and SMT instead of CMT. None of that explains why GCIII doesn't utilize those cores properly.
Intel's IPC is undisputedly higher per core; however, that doesn't make the IPC of the AMD cores pathetic, just lower, and when you have more cores it is still competitive in programs that take advantage of more cores.
Brad has said GCIII will take advantage of every core you throw at it, when in fact it doesn't appear that is the case, in fact there doesn't appear to be much truth to a lot of GCIII's features.
I'm running 8.1. The 970 is massively overkill for this. It's overkill to the point where it never even runs at the max overclock I set because it doesn't need to to hit the 62.4fps max framerate for Nvidia's adaptive v-sync driver setting on a 60Hz refresh rate panel. I can't recall ever seeing GPU utilization above 60% and I don't think the core has ever ramped up to its max clock speed ever.