I'll be honest: I'm not entirely convinced by the whole "threading would be too difficult" argument. I've done a lot of multithreading for a game engine recently and I've tried loads of different methods and you're right- it *is* very difficult. Right now I've settled on using a spatial partitioning design which is more than a little bit complicated (and would be pretty difficult to retrofit into a game like DF, but not impossible) - but it has the advantage of meaning that all real game logic (moving things around, building things, removing things, etc. etc.) can be done from a single-threaded perspective.
Imagine the game world being split into multiple little bubbles of activity (that are constantly changing after areas are processed and the 'bubble' moves on) - with non-active areas serving as a buffer between all the bubbles. The code inside the bubble works totally serially and single threaded, as long as it doesn't reach outside the bubble.
For my application it's quite easy to abide by this constraint, because it mostly involves simulating particle movements. The pathfinding used by dwarves in DF would complicate matters, because the pathfinding code necessarily has to "reach out" beyond the bubble's safe boundaries. The best solution for this, I think, would be to maintain a thread-safe, abstracted data structure designed for pathfinding that the real map feeds data into. The pathfinding dwarves would use this overlaid structure to read from, so they don't ever interfere with one another. This also has the advantage of providing an obvious place for pathfinding optimisations (caching, predictive pathing, you name it).
Er, anyway - that was besides the point. What I *actually* wanted to say was this: if adding some sort of all-encompassing data-parallelised multithreading (as I described above) would be too difficult (and I think we'd all agree that it probably would), that doesn't mean the most CPU-intensive parts of DF can't be parallelised.
Putting pathfinding and flows on individual seperate threads isn't a great idea (known as functional threading) because it doesn't scale - it'll work for now, but in 3 or 4 years when we're running 12 or 16 cores as standard we'll be crying out for DF to scale beyond 3 or 4 cores and it'll have to be multithreaded all over again.
Here's how DF could have parallelised pathfinding that scales practically indefinitely:
I'm assuming right now that DF handles the turn system by having each creature individually take his turn: assess the situation, pathfind (or not), then move.
This is hard to parallelise, as one creature might be pathfinding whilst another creature moves into the area the first is trying to pathfind through.
Instead, if toady were able to rejig DF's turn system so that every pathfinding request for that turn were all executed at once, you could offload the pathfinding to every core on the user's machine. DF's pathfinding is totally read-only (I'm assuming!) and doesn't affect the landscape it's pathfinding over, so there's absolutely no problem with thread/memory interference.
A user's 8 core i7 mega-machine could blast its way through the "pathfinding phase" then return to single threadedly handling all the creature movement, combat and so on. This would be pretty easy to implement - the only thing it'd require is that DF's turn system is changed so that allows all the pathfinding to be done at once. Everything else would be the same.
The game as a whole won't scale indefinitely (flows, general movement and combat etc. would still be just as slow), but it'll remove a significant bottleneck in current fortresses - possibly kicking up our maximum fortress populations from the 200 mark to maybe 400, 500?
The fact that 80% of gamers* are now using multicore machines (25% of gamers having a quad core!), I think it'd be a really good idea to seriously consider adding a "parallelised pathfinding phase" to DF, especially considering how easy it'd be if the turn system was changed.
What do you guys think? Any ideas?
*http://store.steampowered.com/hwsurvey/