Assuming funds weren't really an issue, would it be possible to make like a 32 gig single core? And then run that in a normal computer, and run solely DF on the 32 gig core?
Although someone was saying DF doesn't actually use that much RAM, and that a higher-speed CPU would have more of an effect.
/computers are hard/
You seem to have some confusion about computers in general and DF in particular. DF in practice behaves much more like scientific computing than like most games. Simplifying, a computer has a small amount of memory actually on the chip with the CPU, which is very fast; this is the "cache". (In practice there are several levels of this; don't worry about that yet.) Then there is the computer's main memory, which is moderate speed. Then there is SSD-style storage or a hybrid disk cache, which is painfully slow, and actual spinning hard drives, which are unimaginably, glacially slow.
Let's say you're at an exclusive club, and you suddenly realize you need a different credit card than the one you were planning on using originally. If your wallet is in your hand already and you just pull out a different card, that's like the data being in a register on the CPU. If you have to pull your wallet out of your pocket and swap cards, that's like the data being in the cache. If you left that card in a different wallet in your coat pocket that you checked at the door when you came in, that's like the data being in main memory. If you left that card in your coat which you left in the trunk of your car which was valet-parked in a different building by a unionized worker who just went on break, that's like the data being on the hard drive. If your car with the wallet was handed over to a weird foreigner who promised you wouldn't have to pay parking fees, because they were going to drive it around town using it as a delivery vehicle for their catering business, that's like the data being "in the cloud".
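If you'd rather see the gap than take the analogy on faith, here's a quick sketch you can compile yourself (plain C++, nothing DF-specific; the 64 MiB array size and every name in it are just choices I made for illustration). It sums the same array twice: once in order, so the caches and prefetchers can do their job, and once in a shuffled order, so most reads have to go all the way out to main memory. On a typical desktop the shuffled pass is several times slower, for exactly the reasons above.

[code]
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

int main() {
    const std::size_t n = 1u << 24;          // 16M ints, ~64 MiB: bigger than any L3 cache
    std::vector<int> data(n, 1);

    // The same set of indices, in sequential order and in shuffled order.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::vector<std::size_t> shuffled = order;
    std::shuffle(shuffled.begin(), shuffled.end(), std::mt19937_64{42});

    auto time_sum = [&](const char* label, const std::vector<std::size_t>& idx) {
        auto t0 = std::chrono::steady_clock::now();
        long long sum = 0;
        for (std::size_t i : idx) sum += data[i];
        double ms = std::chrono::duration<double, std::milli>(
            std::chrono::steady_clock::now() - t0).count();
        std::printf("%-12s sum=%lld  %.1f ms\n", label, sum, ms);
    };

    time_sum("sequential:", order);      // mostly served from cache and prefetch
    time_sum("shuffled:", shuffled);     // mostly stalls waiting on main memory
    return 0;
}
[/code]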
When you start a program, it loads from the super-slow hard drive into memory. If you don't have enough memory to load the whole program (plus the operating system itself, and all the *other* programs you are using), the operating system has to play juggling tricks with the super-slow hard drive to pretend it has more memory than it does; this makes everything run very slowly. So, at first, adding more memory to a computer helps performance *tremendously*. In particularly pathological cases, going from 512M to 1G of RAM in an older system might speed up certain programs by as much as a factor of ten.
But once you have enough memory to load *everything*, adding more doesn't really help much. The operating system can try to "guess ahead" at what you might want from the super-slow hard drive, and use idle time to preemptively load it into memory; but you reach diminishing returns on that fairly soon. This is why most ordinary users don't see benefits from more than 8G of memory, if that; most ordinary programs still use no more than 2G to 4G at most, and the majority are much smaller.
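If you want to see that "not enough memory" cliff for yourself, here's a rough, Linux-only sketch (my own illustration, nothing official; the 2 GiB buffer is an arbitrary knob you'd tune to your machine). It touches a big buffer twice and reports how many major page faults, i.e. reads that had to go back out to disk or swap, each pass caused. If the buffer fits in free RAM, the second pass should show roughly zero major faults; if it doesn't, the OS starts the juggling act described above and both the fault count and the runtime balloon.

[code]
#include <sys/resource.h>   // getrusage (Linux/POSIX only)
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

static long major_faults() {
    rusage ru{};
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_majflt;     // page faults that required disk/swap I/O
}

int main() {
    const std::size_t buf_gib = 2;                 // arbitrary; tune to your machine
    std::vector<char> buf(buf_gib << 30, 1);       // allocate and touch every page

    for (int pass = 0; pass < 2; ++pass) {
        long before = major_faults();
        auto t0 = std::chrono::steady_clock::now();
        unsigned long long sum = 0;
        for (std::size_t i = 0; i < buf.size(); i += 4096)   // one read per 4 KiB page
            sum += buf[i];
        double ms = std::chrono::duration<double, std::milli>(
            std::chrono::steady_clock::now() - t0).count();
        std::printf("pass %d: sum=%llu  %.0f ms  %ld major faults\n",
                    pass, sum, ms, major_faults() - before);
    }
    return 0;
}
[/code]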
Once everything is in memory, you have to get the info from the memory to the CPU, and back again. Most programs that ordinary home users deal with use fairly small files, and have a lot of free time waiting for user input. For DF and some sorts of scientific computing, this is more problematic. CPUs with enough on-die L3 cache to hold a full embark don't exist; even the Intel 4790K has only 8M, for instance. L4 cache (in-package but not on-die), as seen in some Haswell mobile CPUs, may help somewhat once second-generation versions reach desktop CPUs, but that's still only 128M. In any case, memory access speed and memory bus speed are the limiting factors here.
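To put a rough number on that memory bottleneck, here's a hedged little sketch in the spirit of the STREAM benchmark (again my own illustration, not DF code; the 256-MiB-per-array size and the scalar 3.0 are arbitrary). The arrays are far too big for any cache and the math is trivial, so the figure it prints is essentially how fast your memory subsystem can feed the CPU, which is the ceiling DF keeps bumping into.

[code]
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    const std::size_t n = 32u << 20;            // 32M doubles = 256 MiB per array
    std::vector<double> a(n), b(n, 1.0), c(n, 2.0);

    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i)
        a[i] = b[i] + 3.0 * c[i];               // trivial math: 2 reads + 1 write per element
    double secs = std::chrono::duration<double>(std::chrono::steady_clock::now() - t0).count();

    double bytes = 3.0 * n * sizeof(double);    // traffic moved, ignoring write-allocate
    std::printf("~%.1f GB/s effective memory bandwidth\n", bytes / secs / 1e9);
    return 0;
}
[/code]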
Let's switch examples. You're trying to optimize a restaurant, and have one guy who takes clean silverware out of one bin, clean napkins out of another, and one of those colored paper sticky strips out of a box. This gets handed to another guy, who folds the napkin around the silverware and attaches the paper strip, then hands it back to be put in a different bin.
If you just had a security camera watching the output bin, you couldn't tell who was the limiting factor. If the guy doing the folding (the CPU) is the limiting factor, spending more money on a guy who is better at taking things out of bins (the memory speed) isn't going to help, and vice versa. A program where the CPU has everything it needs when it needs it, but takes too long to do the work, is "CPU bound"; improving the speed of the CPU will help directly, while other improvements will help only a little, if at all. Conversely, a program that has the CPU waiting around idle for the memory bus to give it the next piece of data would be "I/O bound".
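Here's one way to feel that difference directly; a toy sketch of my own, with arbitrary sizes and iteration counts, not anything from DF. The first loop is the folding guy with everything at hand: dependent arithmetic on values that stay in registers, so it's purely CPU bound. The second loop chases a randomized chain through a ~512 MiB array, so nearly every step is a cache miss and the CPU sits idle waiting on memory, which is the "I/O bound" case in the sense used above. On most machines the second loop takes far longer for the same number of steps, even though each individual step is simpler.

[code]
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

int main() {
    using clk = std::chrono::steady_clock;
    const long iters = 50'000'000;              // same "amount of work" for both loops

    // CPU bound: a chain of dependent arithmetic on values that stay in registers.
    auto t0 = clk::now();
    double x = 1.0;
    for (long i = 0; i < iters; ++i) x = x * 1.0000001 + 1e-9;
    double cpu_s = std::chrono::duration<double>(clk::now() - t0).count();

    // Memory bound: chase a randomized chain through ~512 MiB, so nearly every
    // step is a cache miss the CPU has to wait out. Sattolo's shuffle builds one
    // big cycle, so the walk can't settle into a short, cache-friendly loop.
    const std::size_t n = 64u << 20;            // 64M entries * 8 bytes = 512 MiB
    std::vector<std::size_t> next(n);
    std::iota(next.begin(), next.end(), std::size_t{0});
    std::mt19937_64 rng{7};
    for (std::size_t i = n - 1; i > 0; --i)
        std::swap(next[i], next[rng() % i]);    // swap partner strictly below i

    auto t1 = clk::now();
    std::size_t p = 0;
    for (long i = 0; i < iters; ++i) p = next[p];
    double mem_s = std::chrono::duration<double>(clk::now() - t1).count();

    std::printf("CPU-bound loop:    %.2f s  (x=%g)\n", cpu_s, x);
    std::printf("memory-bound loop: %.2f s  (p=%zu)\n", mem_s, p);
    return 0;
}
[/code]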
We don't really know under what conditions DF is CPU bound vs. I/O bound. In general, we suspect that with a fast modern processor DF is more often (or more likely to be) I/O bound, though that estimate is sensitive to errors. Thus the general recommendation for optimizing a machine for DF: figure out the fastest long-term-stable memory transfer speed you can afford, and then the fastest individual core speed that is compatible with that. Note that "fastest" is not merely GHz, either; raw clock speeds are only valid for comparing chips of identical architecture and similar generations.
Note that any sort of virtual machine arrangement adds overhead; you are *at best* fractionally slower than running without one. There are a wide variety of logistical and practical reasons to run VMs, but raw computational performance isn't one of them.
Note also that you don't need a VM to access quite a lot of memory these days; Dell will cheerfully sell you an off-the-shelf system with 4x 15-core, 2-thread-per-core CPUs (total 60 cores, 120 threads) and 96 slots of 16G each, for 1.5T (1,536G) of directly-accessible RAM. But each of those cores only has a max turbo of 3.1 GHz, and DF can't use more than 4G at most (plus some for OS overhead, etc.); a decent i7 at 3.5 GHz in a desktop with 8G of premium RAM would probably do better. (The aforementioned server monster could, of course, run multiple copies of DF at once without slowing down; but that is rarely what people are looking to do. And given memory bus limitations, it might not actually keep up full speed past 4 copies.)