That works as advertised with a multithreaded application. I am discussing gaining some performance by not hopping the thread (the single thread of DF) from one core to another; there IS overhead for that. Technical details to back that statement up are available if demanded. Toady has been making statements wherein he is beginning to take the first steps towards making DF multithreaded. Once he does that, then the more cores the better, just as you are saying.
Briefly, you are technically correct about there being core-switching overhead, but it's generally quite small. What is much more noticeable is that many modern CPUs have a "turbo" mode where they run faster with fewer cores active; this is largely due to the ability to thermal-cycle amongst the cores while doing fractional on-demand overclocking. If you have a quad-core CPU rated for, say, 3.0 GHz nominal, with up to 3.8 GHz under Turbo Boost (single core max), it's getting that higher speed by rotating the "hot" core around periodically, and idling the previously-hot ones to give them time to cool off. If you lock a program to a single core, it's not going to get the benefit of that, and will probably run at a lot closer to 3.0 than 3.8. This sort of difference is usually *dramatically* more significant than the tiny boosts you may or may not get from not switching.
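To put a rough number on that 3.0 vs 3.8 GHz example, here is a quick back-of-the-envelope sketch. The function name is made up for illustration, and it assumes performance scales linearly with clock speed, which is an upper bound at best (a memory-bound program will see less).

```python
def clock_ratio(turbo_ghz: float, base_ghz: float) -> float:
    """Upper bound on the speedup from running at turbo instead of base clock.

    Assumes speed scales linearly with clock frequency, which is optimistic
    for any program that spends much of its time waiting on memory.
    """
    return turbo_ghz / base_ghz


if __name__ == "__main__":
    # Figures from the example above: 3.0 GHz nominal, 3.8 GHz single-core turbo.
    ratio = clock_ratio(3.8, 3.0)
    print(f"Turbo vs nominal: up to {100 * (ratio - 1):.0f}% faster")
```

So the clock difference alone could be worth something on the order of a quarter more speed in the best case, which is the scale to compare any core-switching savings against.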
TL;DR: modern CPUs and operating systems are, in general, much better about managing system resources on a sub-second basis than anything you can do manually; and much of the old-school advice from many CPU generations back is actually a performance hindrance in the modern day.
Additionally, to reinforce what has been mentioned already... on a reasonably-nice, reasonably-modern system that isn't weirdly resource-starved, it is probable that the limiting factor for DF is the time it takes to read and write quite large amounts of data back and forth between main memory and the various layers of cache. (No current chip has enough on-die cache to store even a small fort's data.) So, sustained memory bandwidth / speed is where you will probably get the most improvements if you want to make a system particularly powerful for running DF.
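If you want a very rough feel for sustained memory throughput on your own machine, something like the sketch below will give a ballpark figure. This is not a proper benchmark and says nothing about DF's actual access patterns; the function name, buffer size and repeat count are all arbitrary choices for illustration. It just times a large buffer copy, where the buffer is picked to be much bigger than typical on-die cache.

```python
import time


def copy_bandwidth_gb_s(size_bytes: int = 64 * 1024 * 1024, repeats: int = 5) -> float:
    """Rough memory-copy throughput in GB/s, taking the best of `repeats` runs."""
    # Fill with non-zero bytes so every page is backed by real memory.
    src = bytearray(b"\x01" * size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full read of src plus one full write of dst
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst
    # Count both the bytes read and the bytes written.
    return (2 * size_bytes) / best / 1e9


if __name__ == "__main__":
    print(f"~{copy_bandwidth_gb_s():.1f} GB/s copy throughput")
```

Run it a few times with nothing else heavy going on; the number is only useful for comparing the same machine before and after a memory or settings change.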
Note, however, that due to the nature of DF, stability and lack of errors in that memory access are tremendously important; many people who overclock run too close to the ragged edge. If you can't run a demanding memory test for several hours straight with no errors, dial back because your settings are too flaky for DF.