Dwarf Fortress, from what I've read and know of it, is one of those cycle-intensive, computationally simple tasks (like, say, rendering a 3D scene in a game).
Unfortunately, it's not computationally simple from a GPU's point of view.
The deal breaker is that GPUs ideally work under the following conditions:
1. You can break your problem up into thousands of instances of small, identical problems.
2. Each instance should access memory in a regular pattern related to its neighbors (i.e. thread 1 reads memory address 1, thread 2 reads address 2, etc.).
3. As many threads as possible should follow the same control path (that is, every time you have an if statement in your code, you're introducing a potential performance problem).
4. You should have as little synchronization as possible, and ideally none.
Some of those restrictions have probably been relaxed since the last time I did serious GPU work a few years back, but probably not too much.
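To make the four conditions above a bit more concrete, here's a rough sketch in plain C++ (not actual GPU code) of the shape of problem a GPU likes. The names `bump_all`, `step` and `cap` are made up for illustration and have nothing to do with DF's real code. Each loop iteration stands in for one GPU thread: iteration i reads element i and writes element i, every iteration does the same work, there's no if statement, and nothing is shared between iterations.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical element-wise update: add `step` to every value, clamped at `cap`.
// On a GPU, each index i would be handled by its own thread.
std::vector<float> bump_all(const std::vector<float>& values, float step, float cap) {
    std::vector<float> out(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        // Same instructions for every i, neighboring i touch neighboring
        // addresses, and std::min does the clamp without an explicit if.
        out[i] = std::min(values[i] + step, cap);
    }
    return out;
}
```

Calling `bump_all({1.0f, 9.5f}, 1.0f, 10.0f)` gives back 2.0 and 10.0. The moment the work for element i depends on what happened to some other element in the same pass, you've left this happy case.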
So, you can't readily do things like creature or item updates on a GPU (way, way too much synchronization is needed), or background world updates. Pathfinding might be possible, and I'm sure someone has implemented A* on a GPU. Temperature updates might be possible, but it's tricky to do right because each cell needs to interact with its neighbors. The weather simulation is similar. Fluids, a major source of FPS drain, would probably be very difficult to implement.
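For the temperature case, the usual way around the neighbor problem is double buffering: read only from the old grid and write only to a new one, so no cell ever sees a half-updated neighbor. Below is a minimal sketch in plain C++, assuming a simple 4-neighbor diffusion rule that I made up for the example. The `Grid` struct, `diffuse_step`, and the constant `k` are all hypothetical, and this is not how DF actually models temperature.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D temperature grid stored row by row.
struct Grid {
    std::size_t w, h;
    std::vector<float> cells;
    float at(std::size_t x, std::size_t y) const { return cells[y * w + x]; }
};

// One diffusion step. Reads only from `src` and writes only to `dst`, so
// every cell could be computed independently (one GPU thread per cell).
void diffuse_step(const Grid& src, Grid& dst, float k) {
    dst.w = src.w;
    dst.h = src.h;
    dst.cells.resize(src.w * src.h);
    for (std::size_t y = 0; y < src.h; ++y) {
        for (std::size_t x = 0; x < src.w; ++x) {
            float c = src.at(x, y);
            float sum = 0.0f;
            int n = 0;
            // Edge checks: exactly the kind of branching condition 3 warns about.
            if (x > 0)         { sum += src.at(x - 1, y); ++n; }
            if (x + 1 < src.w) { sum += src.at(x + 1, y); ++n; }
            if (y > 0)         { sum += src.at(x, y - 1); ++n; }
            if (y + 1 < src.h) { sum += src.at(x, y + 1); ++n; }
            // Move a fraction k of the difference with each existing neighbor.
            dst.cells[y * src.w + x] = c + k * (sum - n * c);
        }
    }
}
```

On a 3x1 grid holding 0, 9, 0 with k = 0.25, one step gives 2.25, 4.5, 2.25, so the total heat is preserved. The catch is that you need a second copy of the grid and a swap after every step, and on a real GPU you'd also have to move that grid across the bus each time the CPU side changes it.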
If anything, pathfinding is probably the only thing that would really be worth the effort, but as Bumber suggested it would be much simpler to do it with multithreading on the CPU and probably for similar gain. In order for Toady to make pathfinding work on GPUs, he'd need to rewrite the pathfinding code using CUDA (only works on Nvidia cards) and/or OpenCL (supported less well on Nvidia cards but works on AMD cards), then write code that is able to detect if your system supports the accelerators (on multiple platforms) and transfer the path information back and forth to the GPU. Not super hard to do, but it's daunting compared to something like OpenMP (just adding extra compiler pragmas for the most part) or pthreads (mostly adding extra function calls here and there).
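To show what "just adding extra compiler pragmas" looks like, here's a small sketch of the OpenMP route, assuming each creature's path request can be answered independently against a read-only map. Everything here (`Map`, `Request`, `path_length`, `solve_all`) is hypothetical, and I've used a plain breadth-first search instead of A* to keep it short. The only OpenMP-specific line is the pragma; without `-fopenmp` (GCC/Clang) the compiler ignores it and the loop simply runs serially.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical map: wall[y * w + x] != 0 means the tile is blocked.
struct Map { int w, h; std::vector<char> wall; };
// Hypothetical path request: start (sx, sy) and goal (gx, gy).
struct Request { int sx, sy, gx, gy; };

// Breadth-first search (not A*). Returns the number of steps on a shortest
// 4-connected path, or -1 if the goal is unreachable.
int path_length(const Map& m, const Request& r) {
    std::vector<int> dist(static_cast<std::size_t>(m.w) * m.h, -1);
    std::queue<int> q;
    const int start = r.sy * m.w + r.sx;
    const int goal = r.gy * m.w + r.gx;
    if (m.wall[start] || m.wall[goal]) return -1;
    dist[start] = 0;
    q.push(start);
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!q.empty()) {
        const int cur = q.front();
        q.pop();
        if (cur == goal) return dist[cur];
        const int cx = cur % m.w, cy = cur / m.w;
        for (int d = 0; d < 4; ++d) {
            const int nx = cx + dx[d], ny = cy + dy[d];
            if (nx < 0 || ny < 0 || nx >= m.w || ny >= m.h) continue;
            const int ni = ny * m.w + nx;
            if (m.wall[ni] || dist[ni] != -1) continue;
            dist[ni] = dist[cur] + 1;
            q.push(ni);
        }
    }
    return -1;
}

// Answer every request. The map is only read, and each iteration writes its
// own slot of `out`, so no locking is needed between threads.
std::vector<int> solve_all(const Map& m, const std::vector<Request>& reqs) {
    std::vector<int> out(reqs.size());
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < static_cast<int>(reqs.size()); ++i) {
        out[i] = path_length(m, reqs[i]);
    }
    return out;
}
```

This only works this cleanly because nothing modifies the map while the searches run. In the real game the map changes constantly, so the hard part would be deciding when it's safe to run the batch, not writing the pragma.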