Topic: Mersenne twister in df (Read 10767 times)

BinaryBeast1010011010 · « **Reply #30 on:** May 29, 2012, 05:46:05 am »

From what I gather, the twister uses a long mersene prime, divide it in segments then takes the first segement, shift it 11 bit then use the temporary first segment created xoring it with the non shifted first segment. Then do the same op with all the remaining segment of the string.
Then if after being xored it is odd it is xored again with 90BDF (or something like that I dont recall)
Question being : here what is the mersenne prime? Is it the seed? What use is the value used in case the fragment is odd?

Thief^ · « **Reply #31 on:** May 29, 2012, 07:52:47 am »

A mersenne prime is a number that can be calculated as (2^N)-1, that ends up being prime. It's the "constant" behind the mersenne twister random number generator. A generator's constant(s) determine the numbers the generator creates much like the seed does, but have much stricter requirements as to what they can be, so they're normally fixed and not changeable.

EDIT: In this case, there are only 47 known mersenne primes, some of which are too small to use and many which are too large to be practical. The exponent (N) of the mersenne prime used also determines how many bits of ram are needed to maintain the state of the generator. The most commonly used mersenne primes are 2^127-1 (nicknamed TinyMT, needs only 127 bits of ram, 16 bytes), 2^19937-1 (MT19937, needs 19937 bits, or ~2.5kB) and 2^11213-1 (MT11213, needs 11213 bits, or ~1.4kB). All of these have had their randomness tested very thoroughly, so I wouldn't recommend using any others without a doctorate-level knowledge of randomness.

Thief^ · « **Reply #32 on:** May 29, 2012, 08:02:19 am »

The "xor if odd" step appears to be the "twist" from the name "mersenne twister". It apparently fixes problems with random number sequences repeating over shorter time than they should.

BinaryBeast1010011010 · « **Reply #33 on:** May 29, 2012, 09:22:42 am »

haaan, I understand now. but when does the seed get used?

Thief^ · « **Reply #34 on:** May 29, 2012, 09:50:49 am »

The seed sets the initial state of the generator, and then isn't used again. With Mersenne Twister, the initial state is so large that usually a simpler RNG (like the standard C one) is used to set up the initial state from a smaller seed.

BinaryBeast1010011010 · « **Reply #35 on:** May 29, 2012, 11:24:51 am »

first :is the seed randomness important in any way? I'd say no but wouldnt be the result even more random?

second : so, a 19937 bit long array wich is the seed, divided into 624 bit long segment, the first 32 bit segment is shifted to make temp, then you xor temp and the first segment (nonshifted), if temp is odd, you xore again with 9908b0df.
then you go on with the next 32 bit segment. is that it?

third question : why 9908b2df (having a 32 bit long value is one thing, but would have anything else worked?)
fourth : would the middle square method be reliable to generate the seed?

Thief^ · « **Reply #36 on:** May 29, 2012, 01:21:01 pm »

1: The initial state for mersenne twister doesn't need to be random, but if it isn't you need to run it a significant number of iterations before it will output numbers that will pass randomness tests.
The other effect of how random the seed is is how likely you are to get the same seed twice. Generally the current time is used in some way in the initial seed, as it's never the same twice.

2: Not quite. It's: temp is made from largest bit of the first 32-bit section and smallest 31 bits (i.e. all but the largest bit) of the next 32-bit section, resulting in 32 bits. temp is then shifted right one bit (removing that bit). If that bit is 1 (i.e. temp was odd) temp is xor'd with 9908b0df. Then temp is xor'd with the 398th segment (397 after the current one) of the state. The repeat with the next 32-bit segment.

3: The only thing I've managed to find out about that number is that the highest bit must be set. This has the effect that if the lowest bit (that gets removed by shifting) is set, then the highest bit gets set, and if not, it doesn't; effectively wrapping the bit to the other end of the number. I don't know the effect of using any other number.

4: The middle square method is pretty useless for anything, as there are many situations in which it will output very short repeating patterns. e.g. inputting the number 7600 and it repeat 7600 over and over forever.

BinaryBeast1010011010 · « **Reply #37 on:** May 29, 2012, 02:05:11 pm »

an example :
1100 1101 0111 0100 0011 0101 1111 0110 first 32 bit
0011 1010 0110 1100 1111 0010 0001 1001 second 32 bit

temp would be 1111 and 0011 1010 0110 1100 0010 0001 1001
shift right : 0111 1001 1101 0011 0110 0001 000 1100
not odd, dont xor with 9908B0DF

then you xor that with 1100 1101 0111 0100 0011 0101 1111 0110 first 32 bit?

and next iteration you do it using 0011 1010 0110 1100 1111 0010 0001 1001 with the next 32 bit fragment?

thinking it over again, the greatest bits...
temp would look like :0 1011 1010 0110 1100 1111 0010 0001 100
dont xor with 9908b0df
xor with 1100 1101 0111 0100 0011 0101 1111 0110

start again with the second and third fragment

wich one is right? the second I would guess

Thief^ · « **Reply #38 on:** May 30, 2012, 03:37:56 am »

The 2nd is right except for two points:
1: temp is checked for being odd before shifting it right by one, so it would be odd.
2: you don't xor with the first value as the final step, you xor with the 398th one.

BinaryBeast1010011010 · « **Reply #39 on:** May 30, 2012, 04:22:16 am »

So when you reach the last two values, you use the 2nd, xor it with the first and that's all?
If you want to start again do you use the first value as last and xor it with the result of the first iteration?

After how many iterations like these do patterns appear?
On the usability of rng in games like dwarf fortress, do thousands of rolls are generated then stored or is only the seed stored?
The second would seem more sensible but it cant explain why two same seeded world gen end up being different. The world would change every time you load a game would nt it?

Thief^ · « **Reply #40 on:** May 30, 2012, 05:31:06 am »

No you wrap round. So the 2nd uses the 2nd, 3rd and 399th. When the last number goes off the end just wrap it to the beginning. Lets you just keep going round and round.

Mersenne twister has a "period" of the same as the mersenne prime used, so it repeats after you generate 2^19937-1 numbers. Or such an insanely long time that it might as well never repeat.

With DF, the original seed is stored, but also all of the results. This is because the results are used for things like the shape of the land in your embark, and you can change that in-game.
If you gen the same world twice (both the same settings and the same seed) then it will give you an identical result. This is one of the awesome features of using a pseudo-random number generator instead of a true-random number generator (which can't be seeded).

Thief^ · « **Reply #41 on:** May 30, 2012, 05:43:00 am »

I'd also like to add that understanding how Mersenne Twister works won't get you any closer to understanding why it works. That requires a lot of high-level maths that I just don't understand.

In the end, you only need to know the "how" if you are coding an implementation of it. In practice you can just download a MT library and use it directly without knowing how or why it works, only that it does and is really damn good.

Starver · « **Reply #42 on:** May 30, 2012, 05:57:49 am »

Quote from: BinaryBeast1010011010 on May 30, 2012, 04:22:16 am

On the usability of rng in games like dwarf fortress, do thousands of rolls are generated then stored or is only the seed stored?
The second would seem more sensible but it cant explain why two same seeded world gen end up being different. The world would change every time you load a game would nt it?

I cannot speak for DF, but what would normally happen in any program that uses a PRNG is that it is initialised with a seed-value[1] and then as each new number is required from the PRNG they are requested. The internal state of the PRNG is morphed from its previous condition to the new one and the single new number putput. (Or the request is fulfilled and then the state morphed, it's functionally the same.)

As to whether "thousands of rolls are generated and stored", well, that's up to the code that collects the PRNG output. If you said "for n=1..1000 do Num[n]=NextRandomNumber()", then you'd indeed be storing thousands of numbers. At the end of the loop, the PRNG function will still be only storing one example of whatever internal state form it normally holds. (But, normally, you'd be reaching a decision point, asking for a number in order to make that decision, then continuing on on whatever path that decision led and then requesting another decision point.[2])

The seed itself, by this time, will be long forgotten. (Unless you poke around into the private data of the PRNG and can somehow perfectly retrace the morphs back to the original state, and translate that back to the seed itself[3], but that's well outside the context of any program that would use it, even if possible.) The fact that the seeds used are provided when you export your worldgen details from DF is a separate functionality provided so that you can recreate a world (even one where you let the computer start with random seeds... What seeded the system that gave the original random seeds is totally irrelevant to this example.

[fakeedit: I don't disagree with Thief^, BTW. I might look at odds with the statement "With DF, the original seed is stored, but also all of the results.", but I would class "all the results" as not storing the generated rolls but only the results of the rolls. i.e. passed through so many other processes as to effectively make them "game data" instead of "the specific list of original random numbers used to create the game data". Which, although there's obviously a causation, is not what I'd call the same thing.]

If I understand your original question, because I'm not absolutely sure I do.

As to the workings of the twister itself, I don't think I can add to the explanations already given. And if the pseudo-code snippets you provided are accurate (funny formatting, aside) that tends to be fairly understandable and should be sufficient.

[1] As described before, either itself 'randomly' obtained (e.g. from the clock time), or a fixed value, or (as an optional alternative to the former, as seen in DF and Minecraft especially) a user-provided seed.

[2] Though this might well be "set height at coordinates [0,0] to something based upon one random number; set height at coordinates [0,1] to something based upon another random number; set height at coordinates [0,2] to something based upon yet another random number; etc..." which could of course be in a loop, but you can argue whether that's the same as storing the source random numbers. Especially if in that loop you are restraining each adjacent height according to those already known, either by scaling and translating the values produced or by rejecting invalid results (in which case you might be requesting several random numbers for a single value and rejecting those that you don't like, but that's a very wasteful way of doing it[4] and you'd be better off with the scale-and-translate method).

[3] Given a possible many->one relationship in the forward direction, unlikely.

[4] The fact that DF has "rejects" during worldgen is down to the complexity involved giving very little chance to have "forward looking" logic (that would mean that it can see that this line of worldgen is 'doomed') or to avoid at the time falling foul of the worldgen rejection criteria. For example, you want creatures and population to die during the history creation, or whatever, but if you have a lower-limit on those that remain you can't just say "right, enough have died now, we can't have them be killed any more". So you start again. (Although, IME, it's mostly geographical/geological causes for worldgen rejection, like "There aren't enough volcanoes!", but you can't suddenly start putting loads more volcanoes in the yet-to-be-formed part of the world, without it looking odd.)

BinaryBeast1010011010 · « **Reply #43 on:** May 30, 2012, 07:51:56 am »

399th? Isnt the array made of 624 elements (minus 31 migrating unused bits)?
If we start counting from the left the last element will be xored with the first one?
In the case of "hello starver" written with mountains chains or trees on a map, I think there is a difference, since the whole world is generated and can be view but the local areas would be generated when explored (wether in fort or adv mode) what would be the point of eating memory for parts of the world that will never be explored

Starver · « **Reply #44 on:** May 30, 2012, 08:49:12 am »

Much aside from the "Hello Starver" being perhaps not an actual realistic feature on my terrain wish-list, I might be mistaken, but AIUI the terrain generator works at the Macro level (world-scale) to identify the basic heights and features and nature of the "embark blocks" (the single unit of which is the embark squares which can be chosen in groups, e.g. 2x2, 3x3, 4x4, at the embark stage). The parameters applied to each blocks, plus the nature of the bordering areas are then taken upon embark (or, while adventuring, when wandering through a particular area) to paint the finer details. Magma tubes, for example, are notoriously contained within a single embark-block, and if an embark block has been told it has a river running from north to east, it will now generate (in a manner that is consistent between visits, in Adventure Mode, and similarly similar if that embark-block is included within a Fortress embark area) a river that correctly matches up with the northward block's southerly-exiting river and eastward block's westerly-entering river, and bends between, with any given features (waterfalls, ox-bow-lakes, whatever it is that the terrain demands) rendered to it.

As and when a player makes any changes (damming the river, diverting it, building upon it) then a 'delta' (difference to the normally-calculated state) is saved, as it might be if there's something changed in Adventure Mode (oh, I don't know... forest fire? ...and I think the presence of dropped goods/unretrieved loot from killed beings, subject to some "shuffling around" if not in a lair/base of some kind). But there's no actual need to store the exact location of every gemstone deposit in the geological column, because their location is calculated as and when needed (unless at the surface, or at the exposed edge of a cavern, really only important to in Fortress Mode, but doubtless still there anyway) in a repeatable manner as my aforementioned "cascading seeds" for increasingly localised details.

Or so I surmise (we've all seen "suspiciously familiar features" crop up every now and then, hinting somewhat at the method of formation, but at the same time can see that the pool of features that can be dynamically created is not small, showing that there's quite a bit of distinct entropy/resolution provided for the generation process), but someone will be along in a moment to correct me, I'm sure.

(And, in which case, all I need to do is isolate the embark-tile code and brute force all possible tile-conditions (including neighbouring tile qualities!) and run the results through some loose-but-powerful pattern-matching algorithm to find the words "Hello Staver" (or at least submit the most likely candidates for my personal review!), then back-derive the worldgen conditioning that might form such a localised tile-pattern, and from that seek the seeds that form it. All I need to do.

)

News:

Author Topic: Mersenne twister in df (Read 10767 times)

BinaryBeast1010011010

Re: Mersenne twister in df

Thief^

Re: Mersenne twister in df

Thief^

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Thief^

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Thief^

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Thief^

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Thief^

Re: Mersenne twister in df

Thief^

Re: Mersenne twister in df

Starver

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Starver

Re: Mersenne twister in df