So... can I just check I'm taking the right information out of this?
1. Modern computer speed gains are almost entirely dependent on multi-threading.
Not really... but the gains from everything else are hitting diminishing returns. The three major categories of improvement *other* than "more cores" are raw clock speed, architecture improvements, and process size. Raw clock speed improvements have been leveling off for some time, process size improvements (die shrinks) are leveling off sharply, and architecture improvements are starting to level off.
Consider:
- This chart of single-thread performance
- This 2012 article on historical single-core performance

Back in the earlier days of DF, single-core performance was climbing at around 50% per year. Somewhere around 10 years ago, it started to level off toward 20% per year by some measures, or even negative by others. And most of the gain was from Intel.
Clock speeds have leveled off hard; a nice but not exceptional CPU from 2008 might be 3.3 GHz, and fairly few people today have CPUs over 4 GHz. In terms of single-thread performance, that 2008 CPU scores about 700% of an (older) baseline, and the best available desktop today for single thread is only about 1,200%... 1,200 / 700 ≈ 1.7, so that's only about a 70% increase in *eight years*.
2. Multi-threading is different from most other optimising, which would be doing little tricks with the code, rather than changing the way the computer runs DF.
Optimizing, for multi-threading or otherwise, is rarely little tricks. It's frequently a very time-consuming process involving code analyzers, and it doesn't port well; beyond the simplest stuff like taking out debug code, optimizing for 64-bit Linux on an Intel CPU with the gcc compiler may be completely different from optimizing for 32-bit Windows on an Atom with Visual Studio, or whatever.
In particular, optimizing for newer CPUs frequently breaks compatibility with older ones. A significant fraction of the speedup in modern CPUs comes from better ways of doing things, which aren't applicable to or simply don't work on older ones. Toady has been pretty good about leaving in support for older systems, partly because performance increases aren't as big of a concern this early, and maintaining multiple builds was a major hassle with his older setup; so having one "runs everywhere" build was the simplest solution.
Among other things, the 64-bit build doesn't need to support anything older than around 2000 at all, basically the Pentium 4 era, and quite possibly nothing older than 2003 or so. If we are lucky, cleaning out the 386, 486, etc. code paths may buy more improvement than the minor cost of the wider word length.
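To put a concrete face on that (purely illustrative; this is not DF's actual code): x86-64 guarantees SSE2, so a 64-bit build can compile in a vectorized path unconditionally, while a one-size-fits-all 32-bit build has to keep a scalar fallback for pre-Pentium 4 hardware.

```cpp
// Sketch only: summing a per-tile temperature array, names made up.
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics -- part of the x86-64 baseline

int32_t sum_temperatures(const int32_t* t, std::size_t n) {
    __m128i acc = _mm_setzero_si128();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)  // four 32-bit values per instruction
        acc = _mm_add_epi32(acc,
              _mm_loadu_si128(reinterpret_cast<const __m128i*>(t + i)));
    alignas(16) int32_t lanes[4];
    _mm_store_si128(reinterpret_cast<__m128i*>(lanes), acc);
    int32_t sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < n; ++i) sum += t[i];  // leftover elements
    return sum;
}
#else
// Scalar fallback: the only version a "runs everywhere" 32-bit build can assume.
int32_t sum_temperatures(const int32_t* t, std::size_t n) {
    int32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += t[i];
    return sum;
}
#endif
```

Multiply that by every hot loop and you can see why supporting 486-era hardware and squeezing modern hardware at the same time is a pain.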
3. Multi-threading, as opposed to optimising the way the linear computing works, would be more efficiently and easily implemented now rather than later.
Multi-threading *today's* code for *today's* computers would be more efficient done sooner. Given that we really don't know what the computers of tomorrow will bring, it's highly speculative to say that optimization for the computers of today would help for the computers around when DF is closing in on 1.0. How much of the effort spent optimizing for the new Pentium CPUs of 20+ years ago would still be valid today? Much of the code has been replaced, and the architecture looks completely different; that's at least the sort of difference we may be looking at.
One very possible answer is that CPUs may go the way of GPUs, with *thousands* of micro-cores. In that case optimization might actually look like assigning a core (thread) per map tile and/or per dwarf, plus a few hundred for overhead. Each micro-core would be responsible for tracking a "Game of Life"-like simple state set for fluids like water, magma, and sand, plus some info about obstructions and occupancy.

(You could do this with GPGPU / OpenCL / CUDA programming on a GPU today, mind you; but that's very non-trivial to set up and would require a *second* powerful GPU in your system, other than the one running your actual graphics, which is still rare. There is a system down the hall that I work with that has four Tesla C2075 cards; each has fourteen 32-core subprocessors for 448 total cores, and 6 GB of on-board RAM, which sounds like a lot... but it's only about 12 MB per core after ECC overhead and dividing it up. Still, that's a 1,792-core system; it could, for instance, process a 32x32 sub-region in a single pass.)
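As a rough sketch of what that per-tile model could look like (a toy rule with made-up names, not anything DF actually does): because each tile's next state depends only on the *current* state of its neighbors, every tile can be updated independently, whether by a handful of CPU threads today or by one micro-core per tile later.

```cpp
// Toy "Game of Life"-style fluid pass: read from the current grid,
// write to a separate next grid, one independent update per tile.
#include <algorithm>
#include <cstdint>
#include <execution>
#include <numeric>
#include <vector>

struct Tile { uint8_t water = 0; bool wall = false; };

constexpr int W = 256, H = 256;

void step_fluids(const std::vector<Tile>& cur, std::vector<Tile>& next) {
    std::vector<int> idx(cur.size());
    std::iota(idx.begin(), idx.end(), 0);

    // std::execution::par hands the loop to however many threads the
    // machine has; on a thousand-core part, that's a core per few tiles.
    std::for_each(std::execution::par, idx.begin(), idx.end(), [&](int i) {
        const int x = i % W, y = i / W;
        const Tile& t = cur[i];
        if (t.wall) { next[i] = t; return; }
        // Toy rule: new water level is the average of the tile and its
        // open orthogonal neighbors.
        int sum = t.water, count = 1;
        auto look = [&](int nx, int ny) {
            if (nx < 0 || ny < 0 || nx >= W || ny >= H) return;
            const Tile& nb = cur[ny * W + nx];
            if (!nb.wall) { sum += nb.water; ++count; }
        };
        look(x - 1, y); look(x + 1, y); look(x, y - 1); look(x, y + 1);
        next[i] = Tile{ static_cast<uint8_t>(sum / count), false };
    });
}
```

The same structure maps almost directly onto an OpenCL/CUDA kernel, which is exactly why the GPU comparison comes up.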
If quantum processors kick in, one of the things that might be relevant is pathfinding; splitting up pathfinding into 2^(n-1) threads, where n is the number of qubits in your quantum card, might be the thing to do.
4. Whether or not multi-threading could actually work really depends on how important the order in which tasks are completed is in DF: whether the computer can do all or any of them at once, or if there's a set order.
That is ultimately adjustable. For instance, the current fluid flow probably isn't the best way to do it if you're going massively parallel; but the subtle details of how fortresses flood under pressure may well change in the future as that is brought up to more modern standards. I wouldn't count on u-bend tricks, wave behavior to trigger pressure plate timers, and the like working the same way ten years out, for certain.
The important trick for multi-threading is *get the low-hanging fruit first*.
Amdahl's law will limit you eventually; nothing as complex as DF will ever be completely parallel, and cross-process communication takes an increasing toll. It's also possible that things like memory access bottlenecks dominate performance so that parallel processing doesn't even help much; if the limiting factor is really how much information you can move between main RAM and the CPU die, how many threads you're running on the die doesn't make much difference.
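To put numbers on Amdahl's law (toy figures, not measurements of DF): if a fraction p of the work parallelizes perfectly over N cores, the best case is S = 1 / ((1 - p) + p/N).

```cpp
#include <cstdio>

// Amdahl's law: upper bound on speedup when a fraction p of the work
// parallelizes across n cores and the remaining (1 - p) stays serial.
double amdahl(double p, int n) { return 1.0 / ((1.0 - p) + p / n); }

int main() {
    std::printf("p=0.90, n=8    -> %.2fx\n", amdahl(0.90, 8));     // ~4.71x
    std::printf("p=0.90, n=1000 -> %.2fx\n", amdahl(0.90, 1000));  // ~9.91x
    std::printf("p=0.99, n=1000 -> %.2fx\n", amdahl(0.99, 1000));  // ~91x
}
```

Even with 90% of a tick parallelized, a thousand cores top out under 10x, because the serial remainder dominates; and that's before memory bandwidth gets a say.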