Bay 12 Games Forum

Please login or register.

Login with username, password and session length
Advanced search  
Pages: 1 2 3 [4] 5 6

Author Topic: Mersenne twister in df  (Read 10768 times)

Thief^

  • Bay Watcher
  • Official crazy person
    • View Profile
Re: Mersenne twister in df
« Reply #45 on: May 30, 2012, 09:25:13 am »

399th? Isnt the array made of 624 elements (minus 31 migrating unused bits)?

1st: Use one bit from #1, 31 bits from #2, and xor with 32 bits from #398
2nd: Use one bit from #2, 31 bits from #3, and xor with 32 bits from #399
3rd: Use one bit from #3, 31 bits from #4, and xor with 32 bits from #400

etc.

when you hit #625 (which doesn't exist) you wrap back to #1.
so:
227th: Use one bit from #227, 31 bits from #228, and xor with 32 bits from #624
228th: Use one bit from #228, 31 bits from #229, and xor with 32 bits from #1


623rd: Use one bit from #623, 31 bits from #624, and xor with 32 bits from #396
624th: Use one bit from #624, 31 bits from #1, and xor with 32 bits from #397

*625th: Use one bit from #1, 31 bits from #2, and xor with 32 bits from #398
(same as #1, you've now looped round and are doing it again)
Logged
Dwarven blood types are not A, B, AB, O but Ale, Wine, Beer, Rum, Whisky and so forth.
It's not an embark so much as seven dwarves having a simultaneous strange mood and going off to build an artifact fortress that menaces with spikes of awesome and hanging rings of death.

BinaryBeast1010011010

  • Bay Watcher
    • View Profile
Re: Mersenne twister in df
« Reply #46 on: May 30, 2012, 10:40:15 am »

Starver, about the cascading seeds, do you think they are taken from the result of a first prng then run through a second? I dont understand how the cascade (or butterfly effect, as I understand it) would oc
cur.
Thief^, why xoring #1 with #398? Does it serve any purpose to xor with an element this far away along the string or is it arbitrary?
After the 624th iteration, to avoid creating the same string of random numbers, do you use it as the seed for the next batch?
Logged
cant stop playing DF?
 : (){ :|:& };:

Thief^

  • Bay Watcher
  • Official crazy person
    • View Profile
Re: Mersenne twister in df
« Reply #47 on: May 30, 2012, 10:46:07 am »

You store each generated number back into the 624-element list, so on the next run round it works off the previous numbers.

The xor'ing #1/#2 with the #398 helps to mix the numbers better I think. I can't find any reasoning behind the exact number.
Logged
Dwarven blood types are not A, B, AB, O but Ale, Wine, Beer, Rum, Whisky and so forth.
It's not an embark so much as seven dwarves having a simultaneous strange mood and going off to build an artifact fortress that menaces with spikes of awesome and hanging rings of death.

BinaryBeast1010011010

  • Bay Watcher
    • View Profile
Re: Mersenne twister in df
« Reply #48 on: May 30, 2012, 10:59:23 am »

Then you can keep it running until you reach the 2^19937-1 iteration, but by then you would need a new seed to keep churning out numbers. And for the twister to work best you need the 19937 bit long seed to be perfectly random...
When the twister is used, does it run continuously,recording its results and looping around until 2^19937-1 or is it only "summoned" to churn out a set of 32 random bits (or more if needed)?
Logged
cant stop playing DF?
 : (){ :|:& };:

Thief^

  • Bay Watcher
  • Official crazy person
    • View Profile
Re: Mersenne twister in df
« Reply #49 on: May 30, 2012, 11:08:23 am »

For reference, this is what the number 2^19937 looks like in full: http://mersenneathome.net/mah_display_prime.php?p=19937
It's 6002 digits long. You won't generate that many random numbers in your lifetime no matter how fast you do it.

EDIT: You ask the generator for 32 bits at a time. It doesn't do anything unless you ask.
« Last Edit: May 30, 2012, 11:09:59 am by Thief^ »
Logged
Dwarven blood types are not A, B, AB, O but Ale, Wine, Beer, Rum, Whisky and so forth.
It's not an embark so much as seven dwarves having a simultaneous strange mood and going off to build an artifact fortress that menaces with spikes of awesome and hanging rings of death.

BinaryBeast1010011010

  • Bay Watcher
    • View Profile
Re: Mersenne twister in df
« Reply #50 on: May 30, 2012, 11:16:34 am »

You have a point, but when a program needs a random, say, percentile die roll, will it be running the twister all the time from start just in case the need arises or would it run it only to get enough bits to have an hexadecimal number in 0-64 range?
« Last Edit: May 30, 2012, 11:23:52 am by BinaryBeast1010011010 »
Logged
cant stop playing DF?
 : (){ :|:& };:

Thief^

  • Bay Watcher
  • Official crazy person
    • View Profile
Re: Mersenne twister in df
« Reply #51 on: May 30, 2012, 11:25:53 am »

Why would you run it constantly? It won't make the numbers any "more random", if that's what you're thinking.
Logged
Dwarven blood types are not A, B, AB, O but Ale, Wine, Beer, Rum, Whisky and so forth.
It's not an embark so much as seven dwarves having a simultaneous strange mood and going off to build an artifact fortress that menaces with spikes of awesome and hanging rings of death.

Starver

  • Bay Watcher
    • View Profile
Re: Mersenne twister in df
« Reply #52 on: May 30, 2012, 01:19:49 pm »

Starver, about the cascading seeds, do you think they are taken from the result of a first prng then run through a second? I dont understand how the cascade (or butterfly effect, as I understand it) would occur.
That is my surmising, but I could be wrong.

As a simple example, imagine we had a (totally awful) 'randomising' function that takes a number, adds three and makes it modulo 10 at each stage.  We seed it with the number five and start to populate a four-by-four master-grid with the results.  First request for a number is 10|5+3| = 8.  The next request for a number is 10|8+3| = 1.  The next is 10|1+3| = 4, etc, giving large-grid values of:
8147
0369
2581
4703

(The next number that would pop out would have been 6, but it won't ever be required, in our example.)

In the context of DF, I'd actually assume that we wouldn't be left with the bare-numbers, but would use those bare-numbers to generate a 'weathered' (and otherwise manipulated) terrain map, and if it ends up getting rejected then another base-map would be generated (starting with the '6', as we request more numbers from the function that we had previously thought had gone into retirement), until that does not reject.  And as long as we seed with the same seed, we'll get the same map (after the same number of rejections!) every time we start from scratch.

But, assuming we're working with the above grid (easily re-calculated, on demand, rather than having to load in from a save-file the final 'master layout' map composed of daughter data, granddaughter data, or greatn-granddaughter data), then let's say that the player is embarking/adventuring in a location so that the game needs to investigate more thoroughly the finer details of the '8' square in the lower-right of the middle, to populate it with a 'finer', 2x2 grid.  The '8' is taken as a grid to produce:
14
70
Of course, '1' at this level of grid need not (will not!) mean the same thing as the '1' at the top-level of grid.  It's just a number, from which local qualities are obtained.  And I also only made it a 2x2 sub-grid because it fit best when replacing the larger number.

It fits within the 'major' grid, as follows:
8147
0369
25
14
70
1
4703


But note that every tile (under this simple scheme) that is given the value '8' at the top level of resolution would resolve down to the same lower pattern...
14
70
147
0369
25
14
70
1
4703
(A similar thing happening to every other number that can be found more than once on that grid.)

But, of course, with a large enough pool of numbers in a PRNG output (0..216?), which also happens to be far better at having results that 'leap around' the number-line, you'd be have more than enough numbers to prevent either two spots on the same top-level map having the same number, or make it too obvious that any particular spot on one map (through one seed) is essentially the same as any particular spot on another map (via a different seed).

Also, note that the macro-level square to the right of the '8's is always '1', and below the '8's is always a '0', but that's in part because I chose numbers with factors in common for my modulo-arithmetic and grid-sizes (choosing (different!) prime numbers in all directions would have been a good start, but I wanted to leave this arbitrary example 'leaky' with such problems, without being overly so, e.g. with 16|n+4| chosen as the PRNG method).  In this event, if an '8' somehow signified a particular kind of terrain, then such rules such as "when abutting an '8' with a '0', paint that side in a particular way", and repeated for all other combinations (including a rule for "when a '0' abuts an '8'" that compliments the melding on the other side of the boundary), would give the same appearance as well.  But if all the '8's are surrounded by different numbers, then this should influenced the end result differently and thus a loam-river-plain (if that's what the top-level '8' means, in this simple example) that in one case has a desert bordering it and in another a marsh might well be given details that are different to each other.  And not just at the edge, although that's where the primary procedural melding occurs.  If the number assigned to any area is combined (in whatever way you care) with the numbers of adjacent areas, for some seeding purposes, then you might be able to hide the formulaic nature of the tiling even better...

I'm not saying this is how DF does it (and, in various ways, I know it does not!).  But it's how I might do it.  In fact, in a couple of toy applications of my own design, I have done something similar, but the results were mixed.  One was particularly good at what it was supposed to do, while another obviously just wasn't right for this particular method.  I have no doubt that Toady has spent a lot of time refining the geography/geology-generating algorithms (as has Notch, the Minecraft guy, who has the same sort of problem and may have used the same family of solution to it[1]), so whatever method used is undoubtedly far better than a simple scaling-up of the one I've just demonstrated.


Quote
Then you can keep it running until you reach the 2^19937-1 iteration, but by then you would need a new seed to keep churning out numbers. And for the twister to work best you need the 19937 bit long seed to be perfectly random...
I'm wondering how much more understanding might be conveyed if we can swap the phrase "perfectly random" to being "goshdarn ineffable"?  Randomness from algorithms is an illusion.  I'd even hesitate to try to replace it with the word 'unpredictable' (as, obviously, one could repeat the exercise and know that the numbers one is going to retrieve are the same numbers one already has).  But the bigger that the internal state being operated with is (in some ways proportional to how big that Mersene prime is) the more different points upon the roller-coaster that is the PRNG output, you could be at, and the longer it takes to get to the point where you're getting a rehash of old values.  To quote Thief^, "...such an insanely long time that it might as well never repeat."  Note "might as well never repeat".  It still does, but I wonder if you know how frequently you're going to need to grab a number in order that you can spot a repeat within the age of the Universe?


[...]will it be running the twister all the time from start just in case the need arises or would it run it only to get enough bits to have an hexadecimal number in 0-64 range?
I don't think you understand the principle at all.  The number 'jumps around' enough between each single invocation ('randomly', you might say, although that's wrong) that you only need to ask if for a 'random' number when you need it.  You don't 'set it running' and tap it.  You'd get no more a 'random' set of results (although a different one!) to consult a background process that's 'ticking' out random numbers (whether required or not).  And at each and every time you asked for a number, you'd be best to take the output number and make it become a number in the 0..64 (or do you mean 0..63?) range.  Typically, the PRNG functions are capable of returning a number that is a fraction that is 0<=n<1 (or whatever inequality it is, check your language/library spec, like I really ought to check mine!).  Something like "int(n*64)" applied to that function would give you a (very roughly!) equal chance of getting each of the numbers 0..63.  Although there's usually also a method to call "rand(x)" to get either 0..x-1 or 1..x or 0..x or something (again, check your documentation) from the function, which does the messy work for you in a bit of code that easier for a human like ourselves to both write and read.

You almost certainly wouldn't be asking a randomising function for a random number that's (say) 0..255, and then if n>63 rejecting it and asking again.  If it is giving you something like 0..255, use that number integer divided by four.  If you get one 0..65535 then similarly divide it by 1024 to bring it down to 0..63.  Whatever is needed.

And if you're writing your own MT function (don't know why you would, to be honest, but don't let me stop you) you should be able to build in the "rand(x)" functionality, and/or the 0..1 fraction one, based upon the potential magnitude (bit-depth) that you know the 'presentable' random number from the MT-functionality is going to be.


[1] From what I understand, he has a near-infinitely extending Perlin-noise-like 'macro' geography-generator, in place of Toady's simulated rainfall/etc situation, but if I'm correct that Toady then creates the specific local geography with a cascaded seed (and influences from adjacent tiles) forming the minutiae of the local geography, then I might also be right in imagining that Notch has a similar generational system for his own volumetric macro-regions.  And if I'm wrong with one, I may still be right with the other. ;)
Logged

Anathema

  • Bay Watcher
    • View Profile
Re: Mersenne twister in df
« Reply #53 on: May 30, 2012, 02:14:24 pm »

I'd even hesitate to try to replace it with the word 'unpredictable' (as, obviously, one could repeat the exercise and know that the numbers one is going to retrieve are the same numbers one already has).
The number 'jumps around' enough between each single invocation ('randomly', you might say, although that's wrong) that you only need to ask if for a 'random' number when you need it.  You don't 'set it running' and tap it.  You'd get no more a 'random' set of results (although a different one!) to consult a background process that's 'ticking' out random numbers (whether required or not).

Considering these two concepts: wouldn't leaving a background process running and continuously generating "random" numbers, then taking the latest one whenever your main thread needs one, solve the "unpredictable" problem? You'd be getting numbers that are no more inherently "random", but they would be different each time you ran the program even with the same seed, making your psuedorandom numbers no longer predictable.

Essentially you'd be setting up a race condition; something we usually try to avoid since they make it impossible (or at least, impractical?) to predict what the program will do, but since unpredictability might actually be a benefit in this situation, a race condition could be desirable. Of course this assumes you want unpredictability even with the same seed - it costs you functionality such as being able to regen the same world in DF given the settings and seed.

Not that I'd seriously suggest such a method unless you really need that unpredictability; leaving a background process continuously generating random numbers is like something you'd do to deliberately waste resources, say if you were stress testing a processor.

And, come to think of it, you could achieve the same effect by basing your starting seed on, say, current system time or page faults in the last minute or whatever. Either way it injects that element of unpredictability each time you run the program, but without a wasteful background process, and with the ability to repeat your results (assuming you record the seed) if you want to for some reason. So all of the above is really just a thought exercise to answer the question "is there any situation in which running a background process generating random numbers would be beneficial?" Unless I'm mistaken the answer is "yes, but there are better ways to solve the same problem."
« Last Edit: May 30, 2012, 02:31:18 pm by Anathema »
Logged
The good news is that ghosts die of old age.

Starver

  • Bay Watcher
    • View Profile
Re: Mersenne twister in df
« Reply #54 on: May 30, 2012, 04:37:25 pm »

So all of the above is really just a thought exercise to answer the question "is there any situation in which running a background process generating random numbers would be beneficial?" Unless I'm mistaken the answer is "yes, but there are better ways to solve the same problem."
Before I read further down your post, to this end point, this is more or less what I was going to respond to you!

You want repeatability?  Do not let your PRNG run loose, independent of the main program.  You want it unreplayable?  Don't bother to do it that way; best just to get a instantaneously unique seed that you don't bother to (or can't) remember in any way.  And both of these eventualities are covered internally, IME, by most[1] programming languages you could be using, without having to worry about it too much.

I'm sure there's better ways of putting the above paragraph, but I know YGTI.


I'm still not entirely sure what BB666 is wanting, of course.  (I'm betting he's not working either for GCHQ or an electronic Roulette-table company.)  And yet I'm also still half expecting someone who really knows PRNG maths to come along and tell me that I'm utterly wrong about some niggling (or not-so-niggling) point, anyway. ;)  So far, the likes of Thief^ and the others who have played their part and obviously know at least as much as I do about this haven't complained at my interjections, which I consider indicative that I'm not too far off. ;)


[1] The exception being, IIRC, the BASIC machines of old that I used that only had dedicated RND()-generator that looped in the background from the moment of switching on, so essentially was the race-condition version.  Only it was probably a hardware provision (on a clock-related chip), so the effort to produce the numbers didn't impact the (by today's standards, positivity primaeval!) processing speed of the 'normal' processing.  Which was usually a matter of BASIC being translated a line at a time, in the first place, unless you'd bothered to entered your code directly in Assembly...
Logged

Anathema

  • Bay Watcher
    • View Profile
Re: Mersenne twister in df
« Reply #55 on: May 30, 2012, 06:43:50 pm »

Before I read further down your post, to this end point, this is more or less what I was going to respond to you!

Yeah, I seem to have answered my own question moments after posting it, but I thought the post was worth leaving there in case the same thought occurs to anyone else.

You want repeatability?  Do not let your PRNG run loose, independent of the main program.  You want it unreplayable?  Don't bother to do it that way; best just to get a instantaneously unique seed that you don't bother to (or can't) remember in any way.  And both of these eventualities are covered internally, IME, by most[1] programming languages you could be using, without having to worry about it too much.

Well, that's the thing, isn't it - all these points are moot, since nearly any programming language you would care to use has built in functionality to churn out adequately psuedorandom numbers, if that's what you want. And if you absolutely need truly random numbers for some arcane purpose, no algorithm is going to do it, you need a webcam and a lava lamp (or just a web cam with the lens cap left on, apparently..).

Of course BinaryBeast doesn't seem to be looking to import a library and use someone else's RNG algorithm; he seems to be trying to understand where the numbers come from, and how close to truly random they are. Which is a strange thing to ask, because all practical evidence points to the answer "close enough for the purposes they're used for," and I doubt any of us writes RNG algorithms for a living and needs a better answer than that. Ah well, what are forums for..
« Last Edit: May 30, 2012, 06:57:05 pm by Anathema »
Logged
The good news is that ghosts die of old age.

BinaryBeast1010011010

  • Bay Watcher
    • View Profile
Re: Mersenne twister in df
« Reply #56 on: May 31, 2012, 03:07:58 am »

I understand better the concept of cascading seeds, I stumbled across a pdf last night, "goodpractices" (which prng and how to use it in bioinformatic and genetic programing). When I read the excellent (at least for the neophyt I am) "a fild guide to genetic programing" I started to wonder about the gening of the starting genpool
And how you could avoid pitfalls. The prng seemed to be what I was thinking about but by then I did not know anything about them (and would you have asked me I would have been quite a skeptical asshat and answered "go get yourself a geiger counter and a ticket for tchernobyl if you want true randomness... Or record all your last pen and paper rpg night die rolls!")
To get back to my pdf "goodpracticeprng"
One important thing was about how you seed your prng. For a MT one could use another prng like KISS?

If anyone knows KISS around here, why hasnt it taken the crown from MT? It seems to be much less complicated (I dont understand what you do with y, are these ">>" meant to be read as bitshift?). Furthermore it seems to need a lot less lines of code.
On another track, I read about something called WELLRNG but they only said that it "could" end up being more used than MT)
Logged
cant stop playing DF?
 : (){ :|:& };:

Thief^

  • Bay Watcher
  • Official crazy person
    • View Profile
Re: Mersenne twister in df
« Reply #57 on: May 31, 2012, 04:01:18 am »

Yeah >> is right shift, << is left shift, ^ is xor, and <<= >>= and ^= are assigning versions: (a ^= b) == (a = (a ^ b)). It comes from the syntax used in the C programming language.

KISS has a much much shorter period than MT, ~2123 instead of 219937. There is a 127-bit variant of MT with a period of 2127 if you need something smaller than the full MT (but based on the same theory).

EDIT: WELL is supposed to be pretty good, but I don't know much about it.
Logged
Dwarven blood types are not A, B, AB, O but Ale, Wine, Beer, Rum, Whisky and so forth.
It's not an embark so much as seven dwarves having a simultaneous strange mood and going off to build an artifact fortress that menaces with spikes of awesome and hanging rings of death.

BinaryBeast1010011010

  • Bay Watcher
    • View Profile
Re: Mersenne twister in df
« Reply #58 on: May 31, 2012, 04:33:18 am »

@starver : then you decide what will be in the square using the square value on the world map as well as the adjacent squares values. Seems pretty understable for df to work on a variation of this algorithm, still it need to use a lot of memory. One day shall we see df used to benchmark super computers?

Thief^, x,y and c being the seed values you could use anything for them, but there should be a floor limit under wich value are too small right?

As I understand it, after choosing x y z and c you use a fixed value for "a" dont you? Do this value stay unchanged as part of the algorithm or is it arbitrary determined with x y z c?
The process xor y with y<<13, xor it again with y >>15 and a last time with y <<7 use y value at first then the xored value from last xor is used for the next? I dont think I get what means the = relation in this case...
Spoiler: code paste (click to show/hide)

Could multiple runs of this algorithm be used to seed a mt?
Logged
cant stop playing DF?
 : (){ :|:& };:

Thief^

  • Bay Watcher
  • Official crazy person
    • View Profile
Re: Mersenne twister in df
« Reply #59 on: May 31, 2012, 04:42:13 am »

= is assignment, not equality, in coding. So y is replaced by the xor'd version, and the new y is used in the next calculation.

Yes you could use it to seed MT, but most implementations of MT include a seed generator already that works well enough.
Logged
Dwarven blood types are not A, B, AB, O but Ale, Wine, Beer, Rum, Whisky and so forth.
It's not an embark so much as seven dwarves having a simultaneous strange mood and going off to build an artifact fortress that menaces with spikes of awesome and hanging rings of death.
Pages: 1 2 3 [4] 5 6