You're only partly right. I can confirm a consistent 10% FPS improvement from dedicating DF to one core and keeping everything else off it. The other processes may be idle, but the moment they're not, CPU time and cache ARE used. And when L1 is as limited as 128K and L2 as 512K, every bit helps. (The L3 is 6 MB, but that's shared between cores.) And remember that L1 is typically about 5x faster than L2, and L2 is 2-5x faster than L3, so by those figures L1 comes out roughly 10-25x faster than L3. That's also assuming a perfect cache-hit rate and a perfect associativity scheme.
If the system were single-core, then disabling services would NOT help, since no matter what, everything shares the same cache. However, since DF is largely single-threaded, giving it a dedicated CPU core is the best possible way to max out performance on a multicore system. This prevents anything, ever, from using that core or its cache for anything except hardware requests and other sub-kernel tasks. Even system processes, services, explorer, etc. can be restricted to the other cores using Process Explorer. It's a very handy tool.
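For anyone who wants to work out the affinity values by hand, here's a rough sketch of the bitmask math in Python. It assumes a 4-core CPU with DF pinned to the last core (core 3); both numbers are just my example, and the function name affinity_masks is made up for illustration. Each core is one bit: the game gets a mask with only its own bit set, and everything else gets all the remaining bits.

```python
def affinity_masks(num_cores, dedicated_core):
    """Return (game_mask, others_mask) as integer affinity bitmasks.

    Bit N set means "allowed to run on core N". The core count and the
    choice of dedicated core are assumptions supplied by the caller.
    """
    if not 0 <= dedicated_core < num_cores:
        raise ValueError("dedicated_core must be in range(num_cores)")
    if num_cores < 2:
        raise ValueError("need at least two cores to dedicate one")

    all_cores = (1 << num_cores) - 1        # e.g. 0b1111 for 4 cores
    game_mask = 1 << dedicated_core         # only the dedicated core's bit
    others_mask = all_cores & ~game_mask    # every core except that one
    return game_mask, others_mask


if __name__ == "__main__":
    # Hypothetical setup: 4 cores, DF on core 3.
    game, others = affinity_masks(4, 3)
    print(f"DF mask: {game:#x}  everything else: {others:#x}")
```

The DF mask is what you'd tick (as a single core) in the affinity dialog for the game, and the other mask is the set of cores to leave ticked for everything else. The same hex values should also work with the /affinity switch of the Windows start command, though I've only described the Process Explorer route above.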
Plus, I can overclock that core to as high as 3.6 GHz stably while running the others at 800 MHz. (With all cores clocked synchronously, however, it maxes out at 3.3 GHz. 3.4 works but gives minor errors after a day or two of prime benching.)