Technical details, for the overly curious:
The only reason this hack is needed at all is that DF does a lousy job of rendering. Put simply, its drawing algorithm works as follows:
For each tile, draw its background color. Then draw the actual tile, in the foreground.
Afterwards, flip the buffer so the tiles actually appear on-screen. (This is part of the problem - nothing is supposed to appear on-screen before the flip, so depending on the driver, it may be storing an awful lot of details for later use.)
Well, the algorithm's not really a problem in itself. The problem is that it uses immediate-mode OpenGL rendering to do this, which was deprecated more than a decade ago and is now horribly slow on all but the best drivers. As gamers who don't buy workstation boards, we don't have the best drivers. There are, of course, better ways to do this, but DF doesn't use them.
Solution: Write a library that sits as a shim between DF and OpenGL, intercepts DF's calls, figures out what they are trying to draw, and draws it using better techniques. That's what I'm doing.
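For anyone who wants to see the "figure out what they are trying to draw" part in miniature, here is a rough sketch in Python. It is not the actual library and not DF's real call sequence: TileRecorder, draw_tile_immediate and the simplified call names are all made up for illustration, and it assumes that an untextured quad is a tile's background and a textured quad is its foreground.

```python
# Hypothetical sketch: every name below is invented for illustration.

class TileRecorder:
    """Stands in for the shim: receives immediate-mode style calls and
    builds one record per tile instead of drawing anything."""

    def __init__(self):
        self.tiles = {}        # (x, y) -> {"bg": color, "fg": color, "tex": id}
        self._color = None     # last color that was set
        self._texture = None   # currently bound texture (None = untextured)
        self._verts = []       # vertices of the quad being built

    def color(self, r, g, b):
        self._color = (r, g, b)

    def bind_texture(self, tex_id):
        self._texture = tex_id

    def begin_quad(self):
        self._verts = []

    def vertex(self, x, y):
        self._verts.append((x, y))

    def end_quad(self):
        # The corner with the smallest x and y identifies the tile cell.
        cell = min(self._verts)
        record = self.tiles.setdefault(cell, {"bg": None, "fg": None, "tex": None})
        if self._texture is None:
            record["bg"] = self._color      # assumed: untextured quad = background
        else:
            record["fg"] = self._color      # assumed: textured quad = the tile itself
            record["tex"] = self._texture


def draw_tile_immediate(gl, x, y, bg, fg, tex_id):
    """Imitates the caller: background quad first, then the textured tile."""
    corners = ((x, y), (x + 1, y), (x + 1, y + 1), (x, y + 1))
    for texture, color in ((None, bg), (tex_id, fg)):
        gl.bind_texture(texture)
        gl.color(*color)
        gl.begin_quad()
        for vx, vy in corners:
            gl.vertex(vx, vy)
        gl.end_quad()


recorder = TileRecorder()
draw_tile_immediate(recorder, 2, 3, bg=(0, 0, 0), fg=(255, 255, 0), tex_id=7)
print(recorder.tiles)
```

The point is only that a stream of small state-setting calls collapses into one compact record per tile, which is the form that can be handed to the GPU in one go.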
To be more specific, I'm building a tile catalog once and uploading it as a single texture (as opposed to the one-texture-per-tile scheme used by DF - this is why fullscreen toggling doesn't work, by the way). Then, once per frame, I'm gathering all those tile-drawing calls into an array and uploading it to the GPU as a second "texture". A pixel shader then reads both textures, using the second to decide which tile from the first to draw, as well as in which color.
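The two-texture lookup is easier to see on the CPU. The following is a minimal Python simulation of what such a pixel shader would compute per pixel, not the shader itself. The 2x2 tile size, the ATLAS layout, the (tile index, foreground, background) record format, and the function names are all assumptions chosen to keep the example tiny.

```python
# Hypothetical CPU simulation of the per-pixel lookup; names and layout are assumed.

TILE_W, TILE_H = 2, 2

# "Texture" 1: the tile catalog, all tiles side by side in one image.
# 1 = foreground pixel, 0 = background pixel.
ATLAS = [
    # tile 0  tile 1  tile 2
    [0, 0,    1, 1,   1, 0],
    [0, 0,    1, 1,   0, 1],
]


def sample_atlas(tile_index, px, py):
    """Read one pixel of one tile out of the shared catalog."""
    return ATLAS[py][tile_index * TILE_W + px]


def shade_pixel(grid, sx, sy):
    """What the shader would do for screen pixel (sx, sy)."""
    cell_x, px = divmod(sx, TILE_W)          # which tile cell, and where inside it
    cell_y, py = divmod(sy, TILE_H)
    tile_index, fg, bg = grid[cell_y][cell_x]   # "texture" 2: per-frame tile data
    return fg if sample_atlas(tile_index, px, py) else bg


def render(grid):
    """Run the per-pixel lookup over the whole screen."""
    width = len(grid[0]) * TILE_W
    height = len(grid) * TILE_H
    return [[shade_pixel(grid, x, y) for x in range(width)] for y in range(height)]


# One row of two cells: solid tile 1 in white on black, diagonal tile 2 in red on blue.
frame = [[(1, "W", "k"), (2, "R", "b")]]
for row in render(frame):
    print("".join(row))
```

Only the small per-frame grid changes from frame to frame; the catalog stays on the GPU untouched, which is what makes a single upload per frame enough.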
This reduces the number of OpenGL calls to something like four per frame, as opposed to the 60,000 I often saw from DF before.
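To get a feel for how immediate mode reaches numbers of that size, here is a back-of-the-envelope estimate in Python. The per-tile call counts are assumptions about a typical immediate-mode quad, not measurements of DF, and the 80x25 grid is just an example size.

```python
# Illustrative estimate only; the per-tile costs below are assumed, not measured.

# Background quad: set color, begin, 4 vertices, end.
BACKGROUND_CALLS = 1 + 1 + 4 + 1
# Foreground quad: bind texture, set color, begin, 4 x (texcoord + vertex), end.
FOREGROUND_CALLS = 1 + 1 + 1 + 8 + 1


def immediate_mode_calls(columns, rows):
    """Estimated OpenGL calls per frame when every tile is drawn on its own."""
    return columns * rows * (BACKGROUND_CALLS + FOREGROUND_CALLS)


print(immediate_mode_calls(80, 25))
```

Under these assumptions every tile costs 19 calls, so the total grows with the grid size, while the batched approach stays at a handful of calls no matter how many tiles are on screen.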
Well, if there are any questions... it's late, I'm going to sleep, but I'll do my best to answer them tomorrow. Have fun.