Topic: Optimization (Read 7739 times)

lordlp · « **on:** August 31, 2012, 03:52:39 pm »

I know discontent rarely is met with tea and crumpets here, but when will optimization for this game become a priority?

Always loved Dwarf Fortress but the biggest turn-off is the downward spiral of fps – eventually leading to an unplayable game. Design-wise, there are things that can be done to improve the optimization, which I think is more of a priority than adding more ‘stuff’.

Just my opinion as one of the biggest fans, I’m having a very hard time getting back into this game due to these issues.

Telgin · « **Reply #1 on:** August 31, 2012, 04:24:14 pm »

Not any time soon, most likely.

Toady seems content to work toward making the game more complete first, and although I have no doubt he optimizes as he goes along, he probably keeps this mostly restricted to the code he's working on, not older stuff (which probably contributes more to FPS death).

On one hand this makes sense: if he's going to go back and change existing code when new features are added, it doesn't make a lot of sense to 'waste' time optimizing it heavily. So I imagine his intention is to optimize features heavily once they are more or less complete. Unfortunately the nebulous concept of complete here is something I doubt even Toady can agree on.

In particular Toady has stated at least once that he's probably not going to multithread DF any time soon, if ever. Which is a shame, but without knowing how the game works internally I have no idea if it could be done with any level of relative ease.

Another thing to consider is that aside from the lack of multithreading, I'd say that DF is remarkably efficient. It's able to get playable FPS with hundreds of units and many thousands of items to keep track of. It's kind of hard to guess how this could really be improved without a significant (although maybe acceptable for some) loss of precision.

AutomataKittay · « **Reply #2 on:** August 31, 2012, 04:29:55 pm »

IIRC, a lot of the issues in 'late' forts come from the number of deaths cluttering up unit checks (I saw a thread a while ago that researched FPS effects and found that too many dead things cause lag), then there's pathfinding, temperature checks and the amount of clutter on the map. I've seen people who doubled their performance with carefully designed pathing and turning off temperature.

I don't think there's much that can be 'simply' optimized in DF, considering its sheer complexity, down to how many degrees Urist your stone is as it falls into magma and whether that right lung's about to be spiked by a bolt.

WealthyRadish · « **Reply #3 on:** August 31, 2012, 08:09:51 pm »

I think the long term optimization strategy is for computing technology to try to keep pace with DF's development. In 20 years when it's finished, hopefully FPS death won't be much of an issue.

If you're a bit more, um, impatient, several mods and tricks can improve FPS. Turning off temperature, consolidating the item material types, reducing population and dug out space, and destroying unneeded items are the big ones.

i2amroy · « **Reply #4 on:** August 31, 2012, 09:03:52 pm »

Check out the Maximizing Framerate page on the wiki for all sorts of nifty tricks that can speed up your game, some of which require changing the game itself, and others that are just small things you can do to provide benefits.

And, as UrbanGiraffe mentioned, there are also a few mods out there that allow for some fairly large FPS increases at the cost of diversity.

Pagan · « **Reply #5 on:** September 01, 2012, 04:07:42 am »

Masterwork mod has a significant boost in FPS from reducing the amount of diversity of things.

One thing I don't agree with from the page on maximizing frame rates though is this one:

Quote

While it's best to run DF with no other programs in the background if FPS is an issue, giving the priority a bump or two can help speed things up regardless.

I often have dwarf fortress running in the background while I'm producing music. My DAW running has zero effect on dwarf fortress and vice versa, even though the DAW can use up tons of resources. I've noticed this with new forts and old established ones. Personally I think the FPS loss is from the amount of read/write being sent to RAM, eventually there are just too many things to check.

CaptApollo12 · « **Reply #6 on:** September 01, 2012, 04:50:26 am »

Print mode STANDARD increased my FPS twofold or more compared to 2D. I just found this and thought I would share it.

Zarat · « **Reply #7 on:** September 01, 2012, 01:48:37 pm »

Quote from: UrbanGiraffe on August 31, 2012, 08:09:51 pm

I think the long term optimization strategy is for computing technology to try to keep pace with DF's development. In 20 years when it's finished, hopefully FPS death won't be much of an issue.

If you're a bit more, um, impatient, several mods and tricks can improve FPS. Turning off temperature, consolidating the item material types, reducing population and dug out space, and destroying unneeded items are the big ones.

No matter how far computing technology advances, O(n^2) algorithms will still be O(n^2).

It's not going to get much better.
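To put rough numbers on the O(n^2) point, here is a minimal Python sketch. It assumes a hypothetical per-tick cost of exactly n*n operations for n units; nobody in this thread has shown what DF's real cost curve looks like, and `max_units` and the operation budgets are made up for illustration.

```python
import math

def max_units(ops_per_tick):
    """Largest n whose n*n pairwise cost still fits in the per-tick budget."""
    return math.isqrt(ops_per_tick)

baseline = max_units(1_000_000)   # hypothetical budget on today's CPU
faster = max_units(4_000_000)     # same game on a CPU four times as fast
print(baseline, faster)           # prints "1000 2000"
```

Under that assumption, four times the hardware only doubles the number of units you can afford.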

The problem is pretty evidently algorithmic inefficiency and the general design. Hopefully one day Toady'll just rework the whole thing. About the only chance we've got. Multithreading at this point could be done, probably without too much difficulty depending on what you thread, but it'd be like a bandaid on a gunshot wound.

EDIT: Heck, if he was going to multithread anything, it'd be nice if he'd split Save off. It's a huge annoyance to save a game and then have an unresponsive UI for the next couple seconds. Would be super easy, wouldn't touch very much code, wouldn't take very much time, and it'd be ultra convenient.
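For what splitting Save off could look like in outline, here is a hedged Python sketch. It says nothing about how DF actually saves: the `world` dictionary, the JSON format and `save_in_background` are all invented for illustration. The snapshot is still copied on the calling thread, so that part of the cost stays in the main loop.

```python
import copy
import json
import os
import tempfile
import threading

def save_in_background(world, path):
    """Copy the state now, then write the copy to disk on a worker thread."""
    # The copy is taken on the calling thread so later edits cannot leak into the save.
    snapshot = copy.deepcopy(world)

    def write():
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(snapshot, handle)

    worker = threading.Thread(target=write)
    worker.start()
    return worker  # caller joins this before quitting or saving again

world = {"dwarves": 70, "year": 254}           # stand-in for real game state
save_path = os.path.join(tempfile.mkdtemp(), "region1.json")
pending = save_in_background(world, save_path)
world["dwarves"] += 1                          # the game keeps running meanwhile
pending.join()
```

The file ends up holding the state as it was when the save was requested, not the state after the later change.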

Coolthulhu · « **Reply #8 on:** September 01, 2012, 02:16:23 pm »

Quote from: UrbanGiraffe on August 31, 2012, 08:09:51 pm

I think the long term optimization strategy is for computing technology to try to keep pace with DF's development. In 20 years when it's finished, hopefully FPS death won't be much of an issue.

CPUs pretty much stopped advancing when it comes to raw, single core power. Until DF becomes multithreaded, the difference between P4 2ghz and Core 20 Cent 100x6ghz will pretty much boil down to 2ghz vs 6ghz.
The worst offenders (temperature, fluids) are actually among the easiest to multithread. In fact, they could even run mostly on GPU with relatively little work.

They Got Leader · « **Reply #9 on:** September 01, 2012, 02:25:45 pm »

I have noticed a bit of optimization since the last version I played (which was roughly a year or so ago). I am sitting with 70 dwarves and a whole lot more FPS than I used to.

Kipi · « **Reply #10 on:** September 01, 2012, 02:30:07 pm »

Quote from: Coolthulhu on September 01, 2012, 02:16:23 pm

The worst offenders (temperature, fluids) are actually among the easiest to multithread. In fact, they could even run mostly on GPU with relatively little work.

And you know this... how? Have you perhaps seen the source code? We have no way of knowing how those things (especially temperature, which is affecting everything) work and when/how the checks are made.

Also, you must remember that Toady has stated that he has no experience in multithreading. So, to gain some speed which may very well be minimal, he has to spend time to learn multithreading. And from experience I can say that it's not an easy thing to do, and I'm studying IT in university.

Telgin · « **Reply #11 on:** September 01, 2012, 03:05:46 pm »

Quote from: Kipi on September 01, 2012, 02:30:07 pm

Quote from: Coolthulhu on September 01, 2012, 02:16:23 pm
The worst offenders (temperature, fluids) are actually among the easiest to multithread. In fact, they could even run mostly on GPU with relatively little work.

And you know this... how? Have you perhaps seen the source code? We have no way of knowing how those things (especially temperature, which is affecting everything) work and when/how the checks are made.

Also, you must remember that Toady has stated that he has no experience in multithreading. So, to gain some speed which may very well be minimal, he has to spend time to learn multithreading. And from experience I can say that it's not an easy thing to do, and I'm studying IT in university.

I'll second this.

I have no doubt that things like pathfinding or temperature calculations could be done in a multithreaded manner, but it might require a substantial rewrite of how it works, and if you're not experienced with writing multithreaded stuff (Toady isn't), then it will possibly add lots of new bugs, and possibly not be any faster. It makes sense from his POV to not do it right now, although I would strongly encourage him to explore this option eventually.

As far as using GPUs for this... well, I'll disagree. I've written a fair number of accelerators on GPUs, and even with high level libraries it can be challenging to get any speedup at all from a GPU. If you have a problem that is well suited to using it, then you'll see tremendous benefit. If your problem only mostly matches the GPU's architecture, then you will probably see some small speed up (maybe 5x for that section of the program). If your problem doesn't match the architecture, expect it to run orders of magnitude slower (been there, done that).

Most people seem to believe that pathfinding is the worst offender, but I don't think it could really be moved to a GPU. Maybe it could, but I would expect it to have serious performance problems due to poor memory locality. This can be worked around with some trickery and magic, but that's stuff I wouldn't expect someone without much GPU programming experience to know how to do (which I presume Toady has none).

The temperature calculations could probably be offloaded without any real concern though, especially if you don't care about losing some precision by introducing race conditions. Fluid flows would be pretty tough to put on a GPU, so I wouldn't recommend that. The question then is how much benefit this gives you in performance, which I'm not sure of. It would probably be significant enough to notice, but even an infinite speedup would only be equivalent to turning temperature off and having no fluid flows.
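The "even an infinite speedup" point is Amdahl's law. A minimal Python sketch follows; the 30% share of frame time spent on temperature and fluids is a made-up figure, not a measurement of DF, and the function names are just for illustration.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Overall speedup when only part of the work is spread across workers."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

def speedup_limit(parallel_fraction):
    """Best case: the parallel part takes no time at all."""
    if parallel_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)

assumed_share = 0.3  # hypothetical: temperature + fluids as a share of frame time
on_four_cores = amdahl_speedup(assumed_share, 4)
best_possible = speedup_limit(assumed_share)
```

With that assumed share, four cores give roughly 1.29x and no amount of hardware gets past roughly 1.43x, which is the same as switching those systems off.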

The real problem with using a GPU though is that it assumes that you've got a GPU (if you're talking about using CUDA or OpenCL, then it has further assumptions). Sure, most people do, but what if you don't? The game would need to support the case where there is no GPU, so double the coding effort (and double the bugs). OpenCL helps mitigate that, but requires the OpenCL runtime and compatible hardware be present on the user's machine, so that's a no go. Or what if you have a crappy GPU and performance is worse than running on the CPU?

I'd actually propose that Toady multithread the temperature calculations as his first step toward multithreading. Moving it to another thread and just running it on a single thread other than the main thread would probably give most of the benefit of turning it off completely, and he could even ignore the race conditions without much care. So what if the tile next to a dwarf is 5 urists of heat off this frame, because the updater thread hasn't gotten to it yet when you read its temperature? Stuff is generally either on fire or not, so a little loss of precision wouldn't matter much.
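A rough sketch of that first step in Python, with everything in it (the 1-D grid, `relax_step`, the numbers) invented for illustration. CPython's interpreter lock means these threads won't actually run faster, so this only shows the structure: one worker keeps updating temperatures while the main loop reads whatever value happens to be there, without locking.

```python
import threading

def relax_step(temps):
    """One in-place pass: each interior cell moves toward its neighbours' mean."""
    for i in range(1, len(temps) - 1):
        temps[i] = (temps[i - 1] + temps[i] + temps[i + 1]) / 3.0

def temperature_worker(temps, stop_event):
    """Keep relaxing the grid until the main thread asks us to stop."""
    while not stop_event.is_set():
        relax_step(temps)

def run_game(temps, frames):
    """Main loop: read a tile each frame while the worker updates in the background."""
    stop_event = threading.Event()
    worker = threading.Thread(target=temperature_worker, args=(temps, stop_event))
    worker.start()
    middle = len(temps) // 2
    samples = []
    for _ in range(frames):
        samples.append(temps[middle])  # unlocked read, may be a pass or two stale
    stop_event.set()
    worker.join()
    return samples

grid = [10000.0] * 5 + [12000.0] * 5   # hypothetical cold and hot halves, in urists
readings = run_game(grid, 1000)
```

Each reading may lag the worker slightly, but it always sits between the coldest and hottest starting values, which is the kind of imprecision the post argues is tolerable.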

After he's gotten his feet wet with that, he can try tackling multithreading the pathfinding (which is tougher than it sounds, depending on how he handles it it may not be a 100% parallel operation), and fluid flows. Fluid flows would be tough because it requires lots of synchronization (which reduces the performance benefit) or else it will have a lot of issues like destroying and creating mass from nowhere (i.e. two threads donate 1 unit of water to a nearby cell, but if they aren't synced then they may both write its current value + 1, thus destroying a unit of water).
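The lost unit of water can be shown in a few lines of Python. This is only a sketch: `lost_update_demo` replays the bad interleaving by hand instead of relying on real thread timing, and `locked_donations` shows the synchronized version. The cell layout and both function names are hypothetical.

```python
import threading

def lost_update_demo():
    """Replay the bad interleaving: both donors read before either one writes."""
    cell = {"water": 3}
    read_a = cell["water"]        # thread A reads 3
    read_b = cell["water"]        # thread B reads 3 before A has written
    cell["water"] = read_a + 1    # A writes 4
    cell["water"] = read_b + 1    # B also writes 4, so A's unit vanishes
    return cell["water"]

def locked_donations(start, donors, units_each):
    """Same donations with a lock around each read-modify-write."""
    cell = {"water": start}
    lock = threading.Lock()

    def donate():
        for _ in range(units_each):
            with lock:
                cell["water"] += 1

    threads = [threading.Thread(target=donate) for _ in range(donors)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return cell["water"]

print(lost_update_demo())             # prints 4, although 3 + 1 + 1 should be 5
print(locked_donations(3, 2, 1000))   # prints 2003
```

The lock keeps mass conserved, but every donation now waits its turn, which is the synchronization cost mentioned above.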

This is tricky stuff.

Zarat · « **Reply #12 on:** September 01, 2012, 03:08:24 pm »

Quote from: Coolthulhu on September 01, 2012, 02:16:23 pm

Quote from: UrbanGiraffe on August 31, 2012, 08:09:51 pm
I think the long term optimization strategy is for computing technology to try to keep pace with DF's development. In 20 years when it's finished, hopefully FPS death won't be much of an issue.
CPUs pretty much stopped advancing when it comes to raw, single core power. Until DF becomes multithreaded, the difference between P4 2ghz and Core 20 Cent 100x6ghz will pretty much boil down to 2ghz vs 6ghz.
The worst offenders (temperature, fluids) are actually among the easiest to multithread. In fact, they could even run mostly on GPU with relatively little work.

While clock speed hasn't moved much - or technically, gone back in a lot of cases - it's definitely not the case that "raw, single core power" has stopped advancing. One core of my i7 clocked at 2.8ghz compared with a P4 at the same speed would run circles so fast around the P4 that uh, a tornado would form and blow the P4 away. Or something. There have been plenty of pipelining improvements, better branch prediction, and all the other micro architecture improvements you can think of. It is literally 2-3x faster when clock speed is normalized.

That said, there are obviously big limits to how much you can shuffle the architecture around and still get significant improvements, and we've pretty much hit a wall when it comes to increasing clock speed. Moore's Law (which, of course, isn't a real physical law, but rather how things have happened to hold up for the past few decades) says that the number of transistors on a chip doubles every two years or so, not that processing power gets twice as fast every two years or so. There is a huge difference.

Basically, don't expect any major speed increases for the next decade unless something radical happens in materials science or we get quantum computing. Expect more cores. But there's a limit to that, too.

Quote from: Kipi on September 01, 2012, 02:30:07 pm

Quote from: Coolthulhu on September 01, 2012, 02:16:23 pm
The worst offenders (temperature, fluids) are actually among the easiest to multithread. In fact, they could even run mostly on GPU with relatively little work.

And you know this... how? Have you perhaps seen the source code? We have no way of knowing how those things (especially temperature, which is affecting everything) work and when/how the checks are made.

Also, you must remember that Toady has stated that he has no experience in multithreading. So, to gain some speed which may very well be minimal, he has to spend time to learn multithreading. And from experience I can say that it's not an easy thing to do, and I'm studying IT in university.

Well, it's a pain to read since it's totally undocumented, but it's not impossible for him to have taken a look at the decompiled binary.

And generally, you don't need to see it anyway to make a statement like "fluids and temperature are among the easiest to multithread" when they are well-understood problems.

But yes, I don't think it'd be worth it in the short-term - even if you perfectly parallelize everything, that's just a 2x-4x improvement for most people. It doesn't fix the underlying problem, you just spread the underlying slowness out. DF could be plenty fast as a single-threaded application.

And no, multithreading is not that difficult. No offense man, but "studying IT in university" doesn't mean much, especially when there are universities out there who have total crap in their CS curriculum. I assume it's a European university (based on the fact that in the States IT generally means "I fix computers" rather than "I understand computer science from computational complexity to grammars" while in Europe it can go either way), but it honestly doesn't tell us anything. And at any rate, Toady has a PhD in mathematics, as I recall, and given I have only a Masters in CS and experience, Toady should be much more capable of understanding it than you or I - this is really all just applied math.

Quote

The temperature calculations could probably be offloaded without any real concern though, especially if you don't care about losing some precision by introducing race conditions. Fluid flows would be pretty tough to put on a GPU, so I wouldn't recommend that. The question then is how much benefit this gives you in performance, which I'm not sure of. It would probably be significant enough to notice, but even an infinite speedup would only be equivalent to turning temperature off and having no fluid flows.

Huh? CFD is done on GPUs all the time and is much faster, thanks to fluid dynamics being highly parallel in nature.

But again, for emphasis: parallelism is not the problem, even if it wouldn't hurt.

AutomataKittay · « **Reply #13 on:** September 01, 2012, 03:26:32 pm »

Quote from: Zarat on September 01, 2012, 03:08:24 pm

And generally, you don't need to see it anyway to make a statement like "fluids and temperature are among the easiest to multithread" when they are well-understood problems.

I'd love to know where this idea comes from. Temperature I can see, since it's mostly static outside of weather and magma (and the occasional fire), but fluid I can't see as being 'obviously' multi-threadable, with pressure variation moving objects and probably needing a duplicate of the physical mapping that'd be shared with pathfinding and pump pressures. I see a lot of statements basically saying that multi-threading is oh so easy and this and that is so obvious. I doubt it's so obvious or split off enough to be parallelizable.

I don't work with software, but I've had similar experiences with hardware of various kinds, where someone or other will claim that it's obvious that this or that should work, and it just doesn't or breaks badly, either because things are so woven together that they're hard to split off or because of a mismatch between functional details.

Zarat · « **Reply #14 on:** September 01, 2012, 03:46:12 pm »

Quote from: AutomataKittay on September 01, 2012, 03:26:32 pm

Quote from: Zarat on September 01, 2012, 03:08:24 pm
And generally, you don't need to see it anyway to make a statement like "fluids and temperature are among the easiest to multithread" when they are well-understood problems.

I'd love to know where this idea comes from. Temperature I can see, since it's mostly static outside of weather and magma (and the occasional fire), but fluid I can't see as being 'obviously' multi-threadable, with pressure variation moving objects and probably needing a duplicate of the physical mapping that'd be shared with pathfinding and pump pressures. I see a lot of statements basically saying that multi-threading is oh so easy and this and that is so obvious. I doubt it's so obvious or split off enough to be parallelizable.

I don't work with software, but I've had similar experiences with hardware of various kinds, where someone or other will claim that it's obvious that this or that should work, and it just doesn't or breaks badly, either because things are so woven together that they're hard to split off or because of a mismatch between functional details.

I didn't say it was obvious, I said it was a well-understood problem. There are jillions of books, papers, websites, and conferences about doing fluid dynamics in parallel. Simulating fluid flows is important for a whole lot more than games, and so there's been tons of work on how to do it quickly and efficiently - though obviously sources targeted towards game developers are likely to include some simplifications given they're not worried about total one hundred percent modeling accuracy.

EDIT: https://www.google.com/search?q=real+time+parallel+fluid+dynamics&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-beta

My favorite quote: "Real-time simulations exploit the embarrassingly parallel nature of the fluid simulation algorithms." from here: http://www.mijagourlay.com/fluid
