Ideally, you would want your RAM to run at the same clock rate as the CPU's instruction cycle, but this hasn't been realistically possible for many many years now.
This is why you see two different figures for a CPU's speed: its rated clock frequency, and the speed of the Front Side Bus (FSB).
The FSB speed is the clock rate, in MHz, at which data can be fetched from and stored in memory. The CPU's clock speed is the FSB speed multiplied by some clock multiplier value.
Obligatory wikipedia on FSB.
The faster the FSB clockrate, the faster the CPU can get at the data in memory.
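To put numbers on the multiplier relationship, here is a minimal sketch. The function name and the sample figures (a 200 MHz FSB with a 15x multiplier) are made up for illustration and are not taken from any particular chip.

```python
def cpu_clock_mhz(fsb_mhz: float, multiplier: float) -> float:
    """CPU core clock = FSB clock x clock multiplier."""
    return fsb_mhz * multiplier


# Hypothetical figures: a 200 MHz FSB with a 15x multiplier.
core = cpu_clock_mhz(200, 15)
print(core)  # core clock in MHz
```

With those assumed figures, the core ticks 15 times for every single tick of the bus it fetches data over, which is the gap the rest of this post is about.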
In addition to the raw clockrate, there is also the raw width of the access itself (how many bits at a time, or the word size). A 64bit CPU has a larger word size than does a 32bit processor. This is where "Dual channel memory architecture" and pals come in.
More obligatory wikipedia.
By using multiple, identical channels to identical memory modules, multiple words can be read/written by the memory controller at a time, allowing larger chunks of memory to be accessed directly per fetch/store cycle.
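As a rough illustration of why clockrate, width and channel count all matter, theoretical peak bandwidth is just those three multiplied together. This is a back-of-the-envelope sketch: the function name is my own, and the sample figures assume DDR3-1600 style memory (1600 million transfers per second over a 64-bit channel). Real-world throughput will be lower than the theoretical peak.

```python
def peak_bandwidth_mb_s(mega_transfers_per_sec: float,
                        bus_width_bits: int,
                        channels: int) -> float:
    """Theoretical peak = transfers/s x bytes per transfer x channels."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_sec * bytes_per_transfer * channels


# Assumed figures: 1600 MT/s, 64-bit wide channel.
single = peak_bandwidth_mb_s(1600, 64, 1)
dual = peak_bandwidth_mb_s(1600, 64, 2)
quad = peak_bandwidth_mb_s(1600, 64, 4)
print(single, dual, quad)  # MB/s for 1, 2 and 4 channels
```

Doubling the channels doubles the ceiling without touching the clockrate at all.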
Because DF uses such large amounts of memory, its major constraint is NOT how quickly it processes its information, but rather how quickly it can get its data in and out of memory. As noted, the CPU's clock speed is the FSB speed times some non-trivial multiplier, so for modern CPUs the FSB is always slower than the CPU clock. This means that applications like DF spend a significant amount of time just sitting around twiddling their thumbs, waiting for data to be read or written before they can move on to the next thing they need to do. DF's data structures are very large. (MANY words large. Many many many words large.) This means that DF's process spends a very large amount of time getting data into and out of the CPU's registers. When this happens, and it is the primary limiter on performance, the process is said to be "memory bound".
The ideal computer for running DF on (which does not currently exist!) is a very fast CPU, with matched clockrates (1:1 relationship between FSB speed and clockrate), with a very very large word size.
Processor cache serves 2 major functions.
Firstly, it stores program instructions so that they don't have to be fetched from memory after an operation completes. Keeping useful instructions flowing from the cache is one of the purposes of a thing called "branch prediction", which is a feature of modern CPUs that modern compilers help along by laying out code sensibly. Toady gets this kind of optimization for free just by turning on optimization flags with his compiler. (The compilers these days often do a better job of optimizing this than does manual hand-assembler optimization, in case you were wondering.) By keeping program instructions in cache memory, which DOES run at 1:1 speed with the CPU's clock (and in conjunction with branch prediction), very complex operations that have conditional checks can work with data in the CPU's registers, and never skip a beat. The CPU just loads instructions from the instruction cache, and does its thing, completely without any need to touch system memory at all!
The other function of processor cache is to cache the contents of frequently used memory addresses. In the case of DF, one of the most frequently accessed memory structures is the item registry vector, and another is the entity registry vector. (The first is a structure that contains the data for all the non-living objects currently being evaluated, and the second is a structure that keeps track of all the living creatures, including dwarves.)
A very large cache would allow larger structures to be accessed immediately by the CPU, rather than having to force the CPU to simply wait around. CPU cache is only important for system architectures that do not have a 1:1 speed pairing between memory and the internal clock speed. (This is why the ideal system does not need it! Main memory *IS* cache! This is also why the ideal system does not exist!)
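The usual textbook way to put a number on this is the average memory access time: the cache hit time, plus the miss rate times the penalty for going out to main memory. The sketch below uses that model. All of the cycle counts and miss rates are invented for illustration; they are not measurements of DF or of any real CPU.

```python
def average_access_cycles(hit_cycles: float,
                          miss_rate: float,
                          miss_penalty_cycles: float) -> float:
    """Average memory access time = hit time + miss rate x miss penalty."""
    return hit_cycles + miss_rate * miss_penalty_cycles


# Invented figures: 4-cycle cache hit, 200-cycle trip to main memory.
small_cache = average_access_cycles(4, 0.10, 200)  # misses 10% of the time
large_cache = average_access_cycles(4, 0.02, 200)  # misses 2% of the time

# The "ideal system": memory is as fast as cache, so a miss costs nothing extra.
ideal = average_access_cycles(4, 0.10, 0)

print(small_cache, large_cache, ideal)
```

Under these assumptions the larger cache cuts the average wait to a third, and in the ideal case the miss rate stops mattering entirely, which is the point about main memory *being* the cache.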
So, when going shopping for a workhorse that can play DF like a beast, you want a system with multichannel memory architecture, with as much cache as possible, and the highest clock rate you can afford, with the fastest FSB speed you can find.
That said, there are some things to be aware of!
1) Multicore chips have a unified/shared cache pool! That is to say, the 1mb or so of processor cache is shared between all of the cores on that cpu's die!
2) Server grade CPUs tend to have MUCH larger cpu cache sizes built into the chip.
This means the best boards you will find for running DF on are not going to be the fancy gaming rigs you find all over newegg. Instead, you are going to find much better love looking for server boards. Like this:
http://www.tyan.com/Motherboards_S7070_S7070WGM2NR
This has 2 discrete CPU sockets, intended for high horsepower XEON processors.
Modern XEON processors can have up to 24MB of L3 (on-processor) cache built into them!
http://en.wikipedia.org/wiki/Xeon
Compare this with the largest you can get with consumer i7 chips, which tops out at around 10MB for high end chips.
(More than you ever could possibly want to know about Intel's CPU offerings and how they stack up.)
The advantage of server boards, like that referenced Tyan offering, is that they use discrete CPUs, each with its very own CPU cache. This means that stuffing DF on one of the CPUs, and other system processes on the other CPU, will allow DF to run without any interference at all from the system processes. That CPU's cache will not have to be shared with other processes' instructions. Couple this with the very large cache size and 4-channel memory architecture, and that Tyan board would run DF like a beast.
It just so happens that it would also cost you a small fortune.
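For the "stuffing DF on one of the CPUs" part, the operating system has to be told which cores a process may use. On Linux, one way to do that is Python's `os.sched_setaffinity`. The sketch below is only that, a sketch: the helper name `pin_to_cpus` is my own, it is Linux-only, and which logical CPU numbers belong to which physical socket is something you have to look up on your own machine (for example with `lscpu`).

```python
import os


def pin_to_cpus(pid: int, cpus) -> bool:
    """Restrict a process to the given logical CPU numbers.

    Returns False on platforms that lack os.sched_setaffinity
    (it is only available on some Unix systems, such as Linux).
    A pid of 0 means the calling process.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(pid, set(cpus))
    return True


# Harmless demonstration: re-apply the CPU set this process already has.
if hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)
    print(pin_to_cpus(0, allowed), sorted(allowed))
```

In practice you would pass DF's process id and the logical CPUs of one socket, e.g. `pin_to_cpus(df_pid, range(0, 8))`, where `df_pid` and the assumption that CPUs 0 to 7 sit on the first socket are both hypothetical.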