So the Mersenne Twister spits out a long string of bits which is saved (can't see where) then used for everything, from world gen to history AND combat. Which way is more resource-expensive: keeping a loooong string of random numbers (first the 19937-bit seed, then each iteration result), or launching/stopping the twister whenever needed? Furthermore, if the same numbers result in oddities during worldgen, wouldn't the same string make weird combat results? (During the siege of xor, 1487 elves were defeated by 1 human, one dwarf and an elf, no survivors)
You still seem to have some serious misunderstandings about how it is used. The MT is almost always handled as a function; the user program just calls for the next random number whenever it needs one. There's nothing that "runs" (starts/stops) except when you invoke it; it steps (iterates) once each time you need a "random" number. The whole point of a seeded pRNG is that you only need to keep a few key parameters (the internal state) at any time; it's extremely storage-efficient at the expense of a small amount of computation.
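To make the "function, not a running process" point concrete, here's a minimal Python sketch. Python's built-in random module happens to use MT19937 itself; the seed value and the dice rolls are just illustrative, not anything from DF:

```python
import random

# Seed once; the generator keeps a fixed-size internal state
# (for MT19937: 624 32-bit words plus an index), not a history
# of every number it has ever produced.
rng = random.Random(12345)          # Python's Random uses MT19937

# Each call steps the generator once and returns the next value.
roll = rng.randint(1, 20)           # e.g. a d20 roll
damage = rng.randint(1, 6) + rng.randint(1, 6)

# The complete state is just this small tuple; it is the same
# size no matter how many numbers you've drawn.
state = rng.getstate()
print(len(state[1]))                # 625: 624 state words + an index
```

Drawing a billion values costs computation, not memory; the state never grows, which is exactly the storage efficiency described above.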
I will point out again that in some common languages, such as PHP, MT is implemented as a core function; you simply call mt_rand() instead of rand() and get both better statistical behavior and faster results. For a good visual example of why using a statistically "better" pRNG can make a difference when you are working with large data sets, see the discussion of the PHP mt_rand() function, and in particular the following image:
The left half of the image generated "random" noise using PHP rand(), which is based on libc rand()... "Many random number generators of older libcs have dubious or unknown characteristics and are slow." The right half of the image generated "random" noise using exactly the same algorithm, but using mt_rand() instead of rand(). Notice how the left side shows distinct patterning, while the right side looks much more like random noise.
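You can see the same kind of difference without PHP. The sketch below pits a bare-bones LCG, using classic old-libc-style constants (my assumption for illustration, not DF's actual generator), against Python's built-in MT19937. The LCG's lowest bit strictly alternates 1,0,1,0..., which is exactly the sort of patterning visible in the left half of that image:

```python
import random

def weak_lcg(seed):
    """Minimal LCG with classic old-libc-style constants --
    chosen purely to illustrate the flaw."""
    state = seed
    while True:
        state = (1103515245 * state + 12345) % (2 ** 31)
        yield state

lcg = weak_lcg(42)
lcg_low_bits = [next(lcg) & 1 for _ in range(16)]

mt = random.Random(42)              # MT19937
mt_low_bits = [mt.getrandbits(31) & 1 for _ in range(16)]

print(lcg_low_bits)   # strictly alternates: 1,0,1,0,... every time
print(mt_low_bits)    # MT's low bits show no such trivial structure
```

For a power-of-two modulus, an LCG's low-order bit always has period at most 2, so any use of `value % 2` style logic inherits that pattern wholesale.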
I should stress again that different pRNGs have different optimization characteristics. MT is NOT suitable for cryptographic purposes; it was specifically developed to provide a high-quality, efficient pRNG for large Monte Carlo simulations. It just so happens that those needs match very closely with what you want for a simulation game like DF.
For instance, one of the problems that some (particularly older) pRNGs have is that they do not sample the available space well; if you plot their output in three dimensions, the numbers they generate will all fall on a limited number of planes. If you are using such a pRNG to provide test input for a program, some combinations of input will simply never be generated, which could be disastrous if your program had a bug that only occurred on a combination of inputs the generator could never produce; your tests would never exercise it. MT was specifically designed to address these issues while remaining fast and simple enough that it could be called many times when running a large number of test cases.
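The textbook offender here is IBM's old RANDU generator (not anything DF uses). This little Python check shows why its 3D plot collapses onto planes: every output is a fixed linear combination of the two outputs before it, so consecutive triples can never leave a small family of planes:

```python
def randu(seed):
    """RANDU: x' = 65539 * x mod 2^31, the infamous example of
    the 'points fall on planes' failure. Seed must be odd."""
    state = seed
    while True:
        state = (65539 * state) % (2 ** 31)
        yield state

gen = randu(1)
xs = [next(gen) for _ in range(1000)]

# Because 65539 = 2^16 + 3, squaring it mod 2^31 gives the identity
# x[n+2] = 6*x[n+1] - 9*x[n] (mod 2^31) for every consecutive triple,
# which confines all 3D points to just 15 planes.
for a, b, c in zip(xs, xs[1:], xs[2:]):
    assert c == (6 * b - 9 * a) % (2 ** 31)
print("all triples satisfy the plane relation")
```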
A cryptographically appropriate pRNG needs to meet a different set of criteria, which loosely can be summarized as not being able to guess the seed or the internal state from watching the output. It generally doesn't matter whether it has good spectral coverage of a problem volume. MT by itself fails badly as a crypto pRNG; observing 624 sequential raw 32-bit outputs is enough to reconstruct its entire internal state and predict every value that follows. But that isn't important at all for MT's intended use. For instance, in DF the user never sees the raw values anyway, and they are used in such a variety of different ways in different parts of the program that it would be unreasonably difficult to derive any substantial value from being able to predict the next number. Besides, in DF the user can already see the seed; there's no secrecy requirement.
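In Python terms the split looks like this: `random` is MT19937 (fast, reproducible, predictable), while `secrets` draws from the OS CSPRNG. The seed and roll range below are just illustrative:

```python
import random
import secrets

# MT19937: fast, reproducible, great statistics -- but predictable.
# Anyone who observes 624 consecutive 32-bit outputs can rebuild the
# state; fine for a simulation or game, where nothing is secret.
sim_rng = random.Random(2024)
combat_roll = sim_rng.randint(1, 100)

# For anything security-sensitive (tokens, keys), use the OS CSPRNG,
# whose output is not feasibly predictable from observation.
token = secrets.token_hex(16)       # 16 random bytes as 32 hex chars
print(combat_roll, token)
```

Same interface shape, completely different guarantees; picking between them is about threat model, not statistical quality.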
I'd also like to clarify an earlier comment... almost all "gaming" dice (what some people think of as D&D dice, boardgame dice, etc.) are statistically flawed. The vast majority of d20, d10, cheap d6, and so on are tumbled in a rock polisher, typically twice with coarse grit (before and after paint) and once with fine; those rounded edges make them significantly less random. Many high-sided gaming dice will start to show bias patterns in as little as a few hundred trials; almost all will display statistical flaws once you get into a few thousand trials. See this video from Louis Zocchi for a great explanation. Casinos have a huge vested interest in having their dice actually be random; they use sharp-edged dice made from high-quality, high-impact, transparent plastic, and they replace them frequently to avoid wear problems. The manufacturing techniques and materials are fairly expensive, and are much harder to apply to dice with shallow-angled faces such as a d20. So a high-quality pRNG (such as MT) used in a correct fashion is actually significantly more random than rolling physical, non-ideal dice in most if not all cases.
Additional note: Like many other parts of computer science, pRNGs are still in a comparatively early stage of development. When MT was first developed in 1997, it was a dramatic improvement in the field; for many statistically important purposes, it produced unambiguously better results yet ran faster. For several years afterward, switching to MT from whatever dubious built-in function you had by default was a clear win, and eventually it became built into libraries and even languages. MT is likely to stay around for a while; it's built into many things. However, that doesn't mean that development isn't continuing.
WELL (Well Equidistributed Long-period Linear) was developed in 2006 and allegedly has superior performance to MT in certain cases. In particular, if you start MT from a seed that puts too many (binary) zeroes into its state, it can take a surprisingly long time, potentially many thousands of calls, for the output to reach the degree of seeming chaotic unpredictability that you generally want; WELL recovers from such states much faster.
In some sorts of Monte Carlo simulation, this isn't a significant drawback; if you're running a million trials of some test, and each one is no more or less significant than any other, having the first few be a hair less random isn't a noticeable problem. On the other hand, for certain gaming purposes, the first few rolls may be "more significant" than others. Imagine, for instance, a hypothetical game or utility that generated a "random" D&D character using a seeded pRNG; the first roll picks the class, the second the race, and so on. With a poorly chosen, unchecked user-supplied seed, you might get a significantly higher or lower number of one class, since that very first roll sets the context for subsequent rolls. This is why you usually seed a pRNG like MT with either a user-supplied value (in cases where it is easy to retry, or where it isn't important) or some function of the clock that gives you a "moderately random" number to use as a good seed.
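Both seeding strategies from that paragraph can be sketched in a few lines of Python; the seed string and the "class roll" are hypothetical, just to mirror the D&D example:

```python
import random
import time

# Reproducible run: a user-supplied "world seed". Fine here because
# the user can simply retry with a different seed if results look odd.
world_rng = random.Random("my-world-seed")   # strings work as seeds

# Non-reproducible run: derive the seed from the clock (or let
# Random() with no argument pull from OS entropy) so the first few
# draws aren't shaped by a low-entropy, hand-typed seed.
auto_rng = random.Random(time.time_ns())

first_class = world_rng.randrange(12)   # hypothetical "class" roll
print(first_class)                      # identical on every run
print(auto_rng.randrange(12))           # varies from run to run
```

The key property is that the same user seed always replays the same sequence, while the clock-derived seed trades reproducibility for a better-mixed starting state.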
There isn't that much info out there on WELL, perhaps because MT is "good enough" for most of the purposes it's used for, and these days is very easy to use. Adding the extra complexity of WELL may simply not be worth the work in most cases. There's nothing to say, however, that tomorrow, or eventually, someone may come up with a very clever pRNG that is as much better and faster than MT as MT was than its predecessors.