Conceptually it is not that difficult to split path finding off into separate threads.
1. Assume the current modified A* continues to be used.
2. Each path finding calculation is done in its own thread.
3. Creatures initiate path finding, and display a thinking symbol until their path thread completes.
4. Creatures that change their mind about needing that path can do so since their thoughts and actions are still being processed on the main thread.
5. A creature that changes its mind sets an abort signal for its path finding thread.
6. The main thread will have map changing actions, some of which are significant enough that all path finding should be held while the change occurs. The mechanics of this are not difficult. The main thread must know how many path finding threads are running. An atomic flag is set to tell each of the threads to pause, and each thread responds to that pause by incrementing an atomic counter. When the counter equals the number of active path finding threads the changes can proceed, and then the flag is cleared to allow the threads to continue.
7. Each of the threads, on continuing, increments its own internal counter for how many times the map has changed during its calculations. Each location that is added to the A* calculation would record the value of that map change counter at the time it was added. When considering adding further locations from a given spot the counter would be checked. If the current change count is higher than the value recorded for that location, the location's validity is checked, and if it is no longer valid all potential additions from that location are removed. This can leave an incomplete path that fails to reach its destination, in which case the search would start over from the source spot with the change counter reset to 0. If checking connectivity is fast enough then it can be done with each map change, or only when starting over. This can result in some suboptimal paths when map changes occur during calculation, and so what, dwarves do dumb stuff involving lots of extra walking all the time.
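To make item 7 a bit more concrete, here is a minimal C++ sketch of the change counter check. Every name in it (`OpenNode`, `Map`, `Search`, `may_expand`, and so on) is made up for illustration and none of it comes from DF's source. It only shows the stamp comparison and the re-validation; removing dependent locations and restarting from the source are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not DF code.
struct OpenNode {
    int x, y;
    std::uint32_t stamp;  // the search's change count when this location was added
};

struct Map {
    int width, height;
    std::vector<bool> walkable;  // row-major grid
    bool is_walkable(int x, int y) const {
        if (x < 0 || y < 0 || x >= width || y >= height) return false;
        return walkable[static_cast<std::size_t>(y * width + x)];
    }
};

struct Search {
    std::uint32_t changes_seen = 0;  // map changes since this search started
    std::vector<OpenNode> open;

    // Record a location together with the current change count.
    void add(int x, int y) { open.push_back({x, y, changes_seen}); }

    // Called by the path thread each time it continues after a pause.
    void on_map_changed() { ++changes_seen; }

    // True if further locations may be added from n.
    bool may_expand(const OpenNode& n, const Map& map) const {
        assert(n.stamp <= changes_seen);           // a stamp can never be from the future
        if (n.stamp == changes_seen) return true;  // added since the last change: trusted
        return map.is_walkable(n.x, n.y);          // older: re-check against the current map
    }
};
```

The point of the stamp is that the validity check is skipped for every location added since the last map change, so a search that never sees a change pays only for one integer comparison per expansion.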
I think that covers all the basic conceptual components of switching path finding to multiple threads. I barely touched on implementation. The implementation would involve major changes. The first of these is to ensure that no map change will invalidate any pointer or array index held by the path finding calculations. The next major implementation hurdle is changing the creature behavior so that the main loop periodically checks whether the path finding has completed. The list of implementation details which can decide whether speed is gained numbers in the hundreds, if not thousands. If you look at what I said in conceptual item 6, you might notice that I am describing peeling apart a mutex structure into a combination group mutex and semi-semaphore. Writing implementation details like that only comes from years of experience working with multi-threading. Without that experience a programmer will tend to select from an existing mutex or semaphore, and may even increase the execution time that a user can perceive (it looks slower).
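As a rough illustration of the pause mechanism in item 6, here is a C++ sketch using two atomics. The names (`PauseGate`, `checkpoint`, `run_demo`) are hypothetical and this is not how DF does anything. One deliberate difference from the description above: it counts running threads down to zero instead of counting paused threads up to the total. The intent is the same, and the count-down form avoids having to reset the counter between pauses. It spins with `yield` for brevity, where real code would probably want to sleep.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical sketch; none of these names come from DF.
class PauseGate {
    std::atomic<bool> pause_{false};
    std::atomic<int>  running_{0};  // path threads currently allowed to read the map
public:
    // Path thread: call before the first map read.
    void enter() {
        for (;;) {
            running_.fetch_add(1);
            if (!pause_.load()) return;  // no pause pending: safe to read the map
            running_.fetch_sub(1);       // back off and wait out the pause
            while (pause_.load()) std::this_thread::yield();
        }
    }
    // Path thread: call after the last map read.
    void leave() { running_.fetch_sub(1); }
    // Path thread: call between A* steps; waits here while a pause is requested.
    void checkpoint() {
        if (pause_.load()) { leave(); enter(); }
    }
    // Main thread: returns once no path thread is reading the map.
    void pause_all() {
        assert(!pause_.load());  // pauses are not nested in this sketch
        pause_.store(true);
        while (running_.load() != 0) std::this_thread::yield();
    }
    // Main thread: let the path threads continue.
    void resume_all() { pause_.store(false); }
};

// Demo: the "map" is two ints that the main thread always changes together.
// A path thread that ever sees them differ has read the map mid-change.
int run_demo(int threads, int steps, int changes) {
    PauseGate gate;
    int map_a = 0, map_b = 0;  // plain ints, protected only by the gate
    std::atomic<int> torn_reads{0};

    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            gate.enter();
            for (int s = 0; s < steps; ++s) {
                if (map_a != map_b) torn_reads.fetch_add(1);
                gate.checkpoint();
            }
            gate.leave();
        });
    }
    for (int c = 0; c < changes; ++c) {
        gate.pause_all();
        ++map_a;
        ++map_b;
        gate.resume_all();
    }
    for (auto& th : pool) th.join();
    return torn_reads.load();
}
```

Calling something like `run_demo(4, 20000, 200)` should report zero torn reads if the gate is doing its job. The default sequentially consistent ordering on the atomics matters here: a thread that increments `running_` and then sees no pause is guaranteed to be seen by a main thread that sets the pause and then reads `running_`.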
Overall, I think there is very little noticeable speed to be gained by splitting path finding off in this fashion. Threading would make a difference at specific moments when many path requests occur all at once, but at all other times the apparent gains would be minimal. Writing multi-threaded code has a steep learning curve and adds whole new layers of complexity to debugging. While the path finding is a good candidate for asynchronous calculation, doing so would likely require Toady to share some portions of his existing code with a programmer who is experienced with the pitfalls of threads. Toady could then get at least enough advice to avoid some of those pitfalls. It would be a good experience for Toady and might lead to him recognizing other areas of DF that are good candidates for multi-threading.
Toady does have a little experience working with others. BHeyler made the first SDL port, working from the Battle Champs source which Toady released specifically for that reason. That port led to Linux and Mac versions by becoming a separate library section, the source of which is included in every DF release. Baughn optimized the OpenGL code in that port. I recognized that the keyboard and mouse connections were a major time sink and rewrote them. Most of my code was ultimately scrapped because of style issues and debuggability, and Baughn wrote a replacement for what I had done with a similar design basis. Coordinating specific code would be difficult for Toady, but getting advice on how to do some things might not be as tough.