And I'd also propose that Moore's Law is a faster force than the growing complexity of DF. By v1.0, DF will be unbelievably complex compared to what we have now, but it won't matter, because computers will be so incredibly powerful that it might take a 1000-dwarf fort to bring the speed down at all. The sorts of things that kill FPS won't end up making the player abandon the fort because the game lags; taken up to eleven, those same things will make simply keeping track of the runnings of the fortress so complicated that you'll close down the game long before the FPS starts to take any sort of a beating.
The difficulty is that this doesn't seem to be the way things are going. Remember,
Moore's Law predicts transistors per chip; even if it continues to hold (and that's a substantial "if", given some of the physics limitations involved), in current designs those extra transistors have gone toward "wider" computing (more pipelines, more cores, and so on). There are significant obstacles to continued die shrinks, which have historically provided a fair chunk of the speed gains. Improvements in the "logic" of the chip tend to be more in the realm of 10-15% per two-year Intel tick-tock cycle.
For comparison, in early 2006 (six years ago) a good Pentium D (an inefficient, early 2-core / 2-thread design) could be bought with a 3.73 GHz stock clock and a rated TDP of 130 W (quite hot for the time), and could be overclocked to around 4.2 GHz with comparatively ordinary measures. The best first-run Ivy Bridge processors, supposedly launching soon (early April of this year?), will have a 3.5 GHz stock clock with easy turbo to 3.9 GHz; but it's an optimized 4-core / 8-thread design with a nominal TDP of only 77 W (its tri-gate transistors are supposedly far more energy-efficient). The Ivy Bridge will probably run DF slightly faster than the Pentium D would, but quite possibly not by a large margin. We'd need a major breakthrough in how chips are made to get really significant single-thread speed increases; and while there is obviously quite a lot of research in that direction, there's nothing one can count on.
Realistically, much of the visible performance increase in modern applications comes from having enough memory (RAM) to keep the entire application and OS in memory without swapping to disk (which is at least an order of magnitude slower). As long as DF is a 32-bit process, and therefore limited to no more than 2 GB for program and data combined, 3 to 4 GB of system memory will hold everything, and there are no further speed gains to be had in that direction. I think it's actually more realistic to push for DF becoming large address aware, if not actually 64-bit clean; that should allow more reliable large embarks and/or long histories without slowing down, and may also help on pipelining on newer CPUs.
Eventually, DF needs to make better use of multiple threads (and cores). That's a daunting task at any level; I have some hope that we'll eventually see limited progress similar to how the graphics rendering was spun off (temperature and flow calculations seem a likely candidate). Right now, though, DF can't take advantage of more than 2 cores (one for the main thread, one for the OS and the graphics handler) or more than 4 GB of RAM; even given a crazy budget, you can't buy or build a system that runs DF much better than a good system from a few years back.