Multithreading is great for running independent tasks concurrently, such as graphics-only particle effects, audio streams, or other tasks that don't require synchronization. Basically, the overhead of splitting a fine-grained task across multiple threads can easily overwhelm the gain of running on multiple cores; but anything coarse-grained enough to split off from DF's main game loop probably won't have a noticeable effect on FPS, aside from rendering and audio.
I'm working on a little Tower Defense game and was stress-testing the projectile system: the max number of towers at the max rate of fire with the slowest projectiles, leading to thousands of bullets on screen. Bullets were drawn with direct memory writes to a raw bitmap that was then blitted over the battlefield, and the projectile data was read from an array of structs.
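For reference, here's roughly the kind of setup I mean. This is a from-memory sketch, not the actual game code; the struct fields and names are just placeholders:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical projectile record; the real struct has more fields.
struct Projectile {
    float x, y;            // position on the battlefield
    float dx, dy;          // velocity per tick
    int ttl;               // ticks until the projectile expires
    std::uint32_t sprite;  // pixel value written into the bitmap
};

// Raw 32-bit bitmap that gets blitted over the battlefield each frame.
struct Bitmap {
    int width = 0, height = 0;
    std::vector<std::uint32_t> pixels;  // width * height entries
    void plot(int px, int py, std::uint32_t c) {
        if (px >= 0 && px < width && py >= 0 && py < height)
            pixels[py * width + px] = c;
    }
};

// Single-threaded per-frame pass: expire, move, (collision omitted), draw.
void update_and_draw(std::vector<Projectile>& bullets, Bitmap& bmp) {
    for (Projectile& p : bullets) {
        if (--p.ttl <= 0) continue;   // expiry check
        p.x += p.dx;  p.y += p.dy;    // position update
        bmp.plot(static_cast<int>(p.x), static_cast<int>(p.y), p.sprite);
    }
}
```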
I split the projectile processing into 4 threads (as I have a quad core... the thread count was configurable); each thread processed the projectiles whose index, taken modulo the thread count, matched its ID, so thread 0 would do 0, 4, 8, 12, 16..., thread 1 would do 1, 5, 9, 13, 17..., and so on. Performance on a single core appeared fairly linear, and there was no direct interaction between projectiles; each one only checked if it was due to expire, updated its position, checked for collision against a pre-culled selection of targets, and wrote its sprite to a calculated position in memory.
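In spirit, the split looked something like this (again a sketch with made-up names, reusing the hypothetical types above, not the real code):

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Interleaved split: thread t handles indices t, t + N, t + 2N, ...
// where N is the thread count (4 on my quad core).
void update_and_draw_interleaved(std::vector<Projectile>& bullets, Bitmap& bmp,
                                 unsigned num_threads) {
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&bullets, &bmp, num_threads, t] {
            // Only indices where i % num_threads == t, so no projectile is
            // ever touched by two threads.
            for (std::size_t i = t; i < bullets.size(); i += num_threads) {
                Projectile& p = bullets[i];
                if (--p.ttl <= 0) continue;
                p.x += p.dx;  p.y += p.dy;
                bmp.plot(static_cast<int>(p.x), static_cast<int>(p.y), p.sprite);
            }
        });
    }
    for (std::thread& w : workers) w.join();
}
```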
Even with no 'lock'ing, and no danger of the threads corrupting each other's state, the multithreaded version was universally slower; I believe because each write invalidated its neighbors' data in shared cache lines, as well as the cache lines of the bitmap. It may still be workable if I figure out how to align the data used by each thread to CPU cache line boundaries; but cache line sizes may vary depending on CPU model...
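If I do revisit it, one thing to try is giving each thread a contiguous block of the projectile array instead of interleaving, so each thread's writes mostly land in cache lines no other thread touches. A sketch, using the same hypothetical types as above:

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Contiguous blocks instead of interleaving: each thread gets its own slice
// of the array, so only the slice boundaries can still share a cache line.
void update_and_draw_blocked(std::vector<Projectile>& bullets, Bitmap& bmp,
                             unsigned num_threads) {
    const std::size_t chunk =
        (bullets.size() + num_threads - 1) / num_threads;  // ceiling division
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&bullets, &bmp, chunk, t] {
            const std::size_t begin = t * chunk;
            const std::size_t end = std::min(bullets.size(), begin + chunk);
            for (std::size_t i = begin; i < end; ++i) {
                Projectile& p = bullets[i];
                if (--p.ttl <= 0) continue;
                p.x += p.dx;  p.y += p.dy;
                bmp.plot(static_cast<int>(p.x), static_cast<int>(p.y), p.sprite);
            }
        });
    }
    for (std::thread& w : workers) w.join();
}
```

Padding each thread's slice out to a cache line boundary would remove even the boundary overlap; C++17 has std::hardware_destructive_interference_size as a portable hint for that size, though compiler support for it is spotty. The bitmap is a separate problem, since projectiles handled by different threads can still write to nearby pixels.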
(while describing this, I came up with an idea for minimizing memory writes that I'll have to try later...)