Topic: Mersenne twister in df (Read 10606 times)

BinaryBeast1010011010 · « **Reply #15 on:** May 26, 2012, 02:12:07 pm »

Macchanger is the first example that came to my mind (since I have not been playing darwinbots for a while, the rand function came up after), I am new to linux, two weeks ago my girlfriend find out the hard way that someone has been downloading copyrighted material with her internet connection, I decided to learn more about it and after a week of constantly studying and trying to understand I learned that even a quite long key was ridiculous if you use wep. I'll say that I knew nothing of what was the later and only had an hint about the former before that week (duh? wep? isnt this some internet thingie?)...
I dont want to be taken seriously since I am a musician and have no skills in programming, but I can get quite obsessive when I dont understand something and I love to learn.
To get back to the crypto track, how the use of a truely random seed state could result in a predictable string of characters?

(I strayed from df, the combat use of random numbers was quite obvious -facedesk- but for terrain generation... I saw lovely fractal landscapes on the web, with moutains, rivers but I did not know that there was any element of randomness in fractals, quite the opposite I thought)

blue sam3 · « **Reply #16 on:** May 26, 2012, 03:48:21 pm »

Quote from: BinaryBeast1010011010 on May 26, 2012, 02:12:07 pm

Macchanger is the first example that came to my mind (since I have not been playing darwinbots for a while, the rand function came up after), I am new to linux, two weeks ago my girlfriend find out the hard way that someone has been downloading copyrighted material with her internet connection, I decided to learn more about it and after a week of constantly studying and trying to understand I learned that even a quite long key was ridiculous if you use wep. I'll say that I knew nothing of what was the later and only had an hint about the former before that week (duh? wep? isnt this some internet thingie?)...
I dont want to be taken seriously since I am a musician and have no skills in programming, but I can get quite obsessive when I dont understand something and I love to learn.
To get back to the crypto track, how the use of a truely random seed state could result in a predictable string of characters?

(I strayed from df, the combat use of random numbers was quite obvious -facedesk- but for terrain generation... I saw lovely fractal landscapes on the web, with moutains, rivers but I did not know that there was any element of randomness in fractals, quite the opposite I thought)

You get a different world every time you generate one. How, pray, do you think that's achieved except with a PRNG?

BinaryBeast1010011010 · « **Reply #17 on:** May 26, 2012, 03:52:47 pm »

I thought the seed was changed every time you gen a world, maybe using the mouse cursor movement or something else.
if you gen two worlds with the same seed (if it is possible) will they be the same?

KoboldWarmonger · « **Reply #18 on:** May 26, 2012, 04:07:27 pm »

If you're trying to keep people off your girlfriend's wifi, use WPA2 with a nice long key (>12 characters) and an SSID (network name) that isn't assigned by an ISP or otherwise terribly predictable. WEP is easily broken even if you use a nice long super-secret-and-random key.

WPA2 isn't perfect but it'd be a serious computing pain in the ass to gain access if you follow those guidelines.

EDIT FOR RELEVANCE TO DF: If you gen two worlds with the same seed they SHOULD be the same, that's why it shows you the key. They might not actually be due to wierdness across different operating systems and stuff- that may not be an issue in the current version, but I can see it happening.

Starver · « **Reply #19 on:** May 26, 2012, 05:08:57 pm »

I can confirm (because I've done it... in a professional capacity, and with much the same tools as you've been using, BinaryBeast) that WEP is trivial to get through. Just by listening, and the key becomes obvious. WPA2, however, without a dictionary-attackable password is essentially secure, no matter how much traffic is captured. Which is not to say that in the couple of years since I did this that they haven't found a way past that as well, but that was the situation...

And I can't think of anything else On-Topic that hasn't already now been said. Oh well.

BinaryBeast1010011010 · « **Reply #20 on:** May 26, 2012, 06:25:24 pm »

I used the aircrack suite to try and get in, it is pretty easy once you get used to it. For a first try I must say that I really like linux. I tried myself at the "metasploit unleashed" tutorial but could not find the so called "dradis framework". I muste be doing something wrong since I use backtrack5 r2 kde 32 and it should be there...
About metasploit there are a lot of things I dont understand and the tutorial's writer seems to assume them as widely known so he does not explain them...

KoboldWarmonger · « **Reply #21 on:** May 26, 2012, 06:29:47 pm »

You should take your questions to a network security forum, or maybe XKCD or something if you want somewhere a little less intimidating- this is the wrong place to be.

BinaryBeast1010011010 · « **Reply #22 on:** May 26, 2012, 06:43:19 pm »

I find these places more intimidating in fact. I tried to post on the backtrack forum but I could not create any new thread, post on old ones or even try and pm the moderators. This is quite intimidating.

Shinziril · « **Reply #23 on:** May 27, 2012, 12:51:44 pm »

For those who require lots and lots of truly random numbers, there's things like this, which generates random numbers based on light passing through a semi-transparent mirror.

Or you can use http://www.random.org/, which generates them out of atmospheric noise.

blue sam3 · « **Reply #24 on:** May 27, 2012, 02:35:40 pm »

Quote from: BinaryBeast1010011010 on May 26, 2012, 03:52:47 pm

I thought the seed was changed every time you gen a world, maybe using the mouse cursor movement or something else.
if you gen two worlds with the same seed (if it is possible) will they be the same?

The seeds (there are several) ARE changed every time you generate a world, according to the output of a Mersenne twister. The same procedure is used for the other randomised stuff.

Miuramir · « **Reply #25 on:** May 28, 2012, 11:04:03 am »

Quote from: BinaryBeast1010011010 on May 26, 2012, 03:52:47 pm

I thought the seed was changed every time you gen a world, maybe using the mouse cursor movement or something else.
if you gen two worlds with the same seed (if it is possible) will they be the same?

DF is an excellent example of what is generally referred to as Procedural Generation, or Procedurally Generated Content (PGC). I recommend reading the Procedural Generation article on the main wiki, and for more interesting detail there's a more specialized Procedural Content Generation wiki being gradually developed.

For Roguelikes and games like DF, PGC is used for several reasons; one of the important ones is that it serves as an extraordinarily efficient compression algorithm for worlds, dungeon levels, biomes, artifacts, and the like. You need three top-level components to make this work: a starting seed, a pseduo-random number generator (PRNG) with certain qualities, and a (set of) algorithms which can generate a wide variety of interesting worlds by "random" generation... using table look-up, fractals, Perlin noise, and various other methods.

The process is loosely that you start with your seed, and use it as the first input to your PRNG. Every time the algorithm needs to generate a "random" result, it uses the next number output by your PRNG. (This is usually first processed internally in such a way as to be a real number between 0 and 1, and then further multiplied out to give results in the desired range. For instance, if your imaginary algorithm says that each dwarven civ has an 80% chance of knowing how to make high boots, when generating a dwarven civ, if the next "random" number is between 0.000 and 0.8000, they do; and if it's between 0.8000 and 1.000 they don't.) Assuming you have chosen a PRNG with the right qualities, the results will be much closer to random than, say, rolling a set of percentile dice; yet if you have all three of the unique conditions (full knowledge of the seed, PRNG, and algorithms), reliably repeatable.

How you get the seed in the first place is a separate question. In some games, this is generated from internal randomness sources; these are usually some combination of slow and mathematically suspect, but you only need one (or a few) numbers at the start from this source. In some games, they are supplied by the level designer; they use various exploration tools to generate a bunch of levels / worlds and save the seeds for the ones that work well. In some games, they are provided by the user. DF uses an interesting mix of all three; if you generate a random world, a random seed will be provided; but you can simply type in your own, or collect interesting ones that other people have found and shared on the forums.

Note that to generate the same result, you need the *exact* same series of rolls in the same order. This is why sometimes when Toady makes a change to worldgen DF, it causes seeds to no longer generate the same results, but not always. For instance, using the made-up high boots example above, if Toady decided to change the odds of a civ having high boots from 80% to 75%, it probably wouldn't affect much early on except that perhaps a civ that was right on the edge would no longer have the boots (for instance if the random number generated at that point was 0.7982, under the old rules they would have them, under the new rules they would not).

Howerver... you could get highly unpredictable ripple effects later; perhaps a worldgen hero from that civ was bitten on the lower leg by a were-beaver early in their career. In the original world, the blow was deflected by the high boot, the hero defeated the were-beaver, and they went on to be a great hero of the dwarves, elevating their mountain home to be a pinnacle of dwarf-kind. In our new alternate reality, the were-beaver latches on, infects the hapless hero; he still climbs the ranks of fighters, but his eventual throne is a weak and corrupt perversion, eventually loosing to the elves when they muster their fast-breeding immortal armies against his constant demands for wooden floodgates and pump components.

Thus, DF is a worked example of the Butterfly effect turned all the way up. In much of the real macroscopic world, butterfly and quantum effects damp out statistically; PGC can have the interesting effect of amplifying these tiny differences into radically different "alternate histories".

DF, unusually for the genre, has player-accessible multiple streams of PRNG for different purposes; SEED (used for the world itself), HISTORY_SEED (used for some history purposes), NAME_SEED (used for name generation), and CREATURE_SEED (used for creature generation). This is so a player can, for instance, decide that they really like the look of the landscape, but the dwarven civs end up with lame names, so they'd like to be able to regenerate with all the other seeds the same, but a different NAME_SEED.

Internally, that means that DF has (at least) five different "random" number calls; one that produces "unseeded" (seemingly to the player) random numbers based on system randomness for use in generating seeds in the first place, and for functions that do not need to be predictable or repeatable; and one each for the four configurable seeds, which produce the next pseudo-random number in the appropriate sequence. One occasional source of error in programs like DF is mistakenly calling the wrong one; this can lead to difficult to diagnose bugs. Note that small games are not the only one vulnerable to this; I'm aware of at least two large commercial MMORPGs that had major persistent bugs that existed for a long time, that were eventually traced to a snippet of code that didn't properly use their "random"-number generator (initialization problems, or called the wrong one for the purpose).

A side note: in professional cryptography circles, the general rule of thumb is "don't try to be clever". The number of cases where someone who thought they knew what they were doing and could optimize, simplify, or improve something and ended up actually breaking the desired random behavior and leaving a weakness is huge. The Mersenne Twister is a pRNG with very good properties for gaming use, and pretty good general applicability for most non-crypto purposes; it's the default in some newer language libraries for good reasons.

Starver · « **Reply #26 on:** May 28, 2012, 03:26:25 pm »

That's a good explanation, there. There are some interesting considerations I'd like to add, though...

Quote from: Miuramir on May 28, 2012, 11:04:03 am

How you get the seed in the first place is a separate question. [..] In some games, they are supplied by the level designer;

Somewhere in a similar vein to this (extracted) example, is something of from my experience.

My first high-level programming (ignoring some boot-strapping dip-switch-entered Machine Code, on some kit-type computer that I don't even remember the details of) was in BASIC, and of course I was used to using the RND() function, which self-seeded itself with "microseconds since the computer was turned on", effectively random through 'environmental' noise, given that the time to type in CH. "" (or especially entering the program line by line) was obviously going to make this auto-seeding (or point in the rotating sequence generator's cycle) essentially different every single time.

However, my first proper language course (as in a compiled language, not a translated one like BASIC) included early on the programming of a "cave game", with a 'randomly' generated floor/ceiling (pixel-wide lines from bottom/top of the display, towards the centre) through which a little plane-type thing was directed by the player to pass through without fouling foul of the simple collision-detection algorithm ("is any part of the plane-type thing being drawn on pixels that are non-black in colour, hence has hit the cave floor/ceiling"). It took a while for me to work out (it was never explained, that I recall) why the randomness function always produced the same shape of first cave, second cave, etc. Unless the randomness was used to do other things (colour the non-space pixels, add other features, etc), but even then would be consistent between each and every time this version of the cave-flying program was run.

Of course, the answer was that this system had a "random()" function that if used without any precursor would always initialise the first time (in any given program-run) with the same value. Probably zero. And so the same, repeatable, sequence of 'randomness' always came out of the this process. Unless we went in instead of it being the first 960-or-so random numbers being used in alternating sequence between the floor height (x480) and the ceiling depth (x480), the odd-numbered ones alternated floor and ceiling, while the even 'randomness' numbers got put into colours. Or whatever the whole thing was about.

In many ways, this was "a sequence chosen by the designer", but really it was all thanks to the person who sorted out the original code samples for the course. Had it come up with something impossible/awkward/whatever, I'm sure they'd have redone their example code to "srand(1)" (or equivalent) at the start, or something else that would have given a working result.

Quote

Note that to generate the same result, you need the *exact* same series of rolls in the same order. This is why sometimes when Toady makes a change to worldgen DF, it causes seeds to no longer generate the same results, but not always.

One thing I've done, in procedurally-generate worlds, is to maintain 'parallel seed-trails'. I'm sure there's a better term for it, but let me explain. If I'm populating a number of planets, with a number of different environmental qualities in a way that I wish to predict, then I would use a seed (fixed at startup, either something arbitrary like zero, to later work with the results to make them go nicely, or by choosing a seed that gives a result that looks good) that assigns a value to each planet in turn (or whatever), without being tasked towards any of the surface features. This value assigned to the planet is then used to initiate that planet's own instantiation of a PRNG sequence engine.

So, let's say I have ten planets, I generate ten (repeatable) values, one for each. These values then seed the PRNG used for that particular planet, and right now I only have two things I'm worried about: colour (planet-wide) and scale. But then imagine that I want to add something else, to each planet, like a given number of moons. I now ask the planet's own PRNG sequencer to give me a value that I can convert into moon-numbers, after I've asked about colour (as before) and scale (as before). The new number request neither changes the colour or scale of the planet nor (because it is not queried of the 'master' PRNG sequencer) propagates changes of any kind for the later planets. As it would do if (instead) I'd previously requested 20 values from the sole fixed-seed PRNG (planet 1 colour, planet 1 size, planet 2 colour, planet 2 size, ...) and had started asking for 30 values, almost immediately mismatching from the original sequence (p1c, p1s, p1m (!p2c), p2c (!p2s), ...).

Should I wish to apply even more complication to my data extraction, such as go for a colour gradient for each planet (pole to equator, and back), I could take the 'traditional' planet colour value and (possibly while using an algorithm that noted the 'seed colour value' and used that to keep the 'average colour' the same as the original single-colour for the planet) used that to spawn a further sub-seeded PRNG which gives me some further procedurally-generated values to guide the colouring. Or I could just request after "colour", "scale" and "moons" values, one (or more) further values related to the gradient. But that'd be messy. (And I would have to add it after, to avoid disturbing the rest of the dynamics.) Similarly, the moon(s) probably need to be assigned orbital parameters, and

There are some issues in doing that, but assuming that one is, at any one time, only viewing a limited amount of the surface and skies of a single planet, the complications behind maintaining multiple PRNG-sequences is going to be only somewhat of a problem. And note that rather than using the "1" in the "there is one moon" value or the "2" in the "there are two moons" one, to seed the sequences that give the moon(s) involved their positions in the planetary skies (which would mean that every one-moon system looks the same, and every two-moon system is similarly alike with its kin), it'd be the raw number (0.000-0.099 for one moon? 0.100-0.199 for two..? ..whatever the scheme concerned), so that variation still exists. Unless you're happy with the outcome of someone being "procedurally the same" for any given leaf-seeding.

Also, you don't need to run through the minute details of planets 1...5 and 7...10 (to extract their 'random' qualities from the stream) when needing to generate planet 6's environment. You just need the Planet6 seed value. And if there's a sub-divided planetary surface, with similarly trickling-down seed-values dictating details of the terrain, you need only work out the seed for the hemisphere you're in, then the portion of that hemisphere that you're in, and the sub-portion, etc... And apart from when you can not bother with any macro-regions (at whatever scale), you are not in contact with. (Obviously, near a boundary, you'd be simultaneously tracking both your own location's PRNG-cascade and any subset of neighbouring PRNG-cascades are also within the perview of your viewing-window, but it's still not the entire planetary surface, unless you're viewing a world-map at which point it's probably a breadth-hungry set of cascading PRNG values, but only to the depth needed to give data at the resolution of that map.)

Anyway, all the above was originally supposed to be about another reason why seeds might break. The PRNG is tasked to a different set of data, between version, and so the mappings between values and results loses cohesion. e.g. even though the chance of High Boots remains the same, an inserted prior check for Kinky Boots picks up the 0.8-or-lower value (whether that makes it true or not) and the High Boots check gets the next value... which could be anything.

Of course, such insertion of demands for a number from the sequence makes everything that follows potentially different (not just the High Boots), I have no idea if anybody's done experiments with pre-arranged fixed seeds worldgenning worlds with/without additional clothing modded into the raws, though, to see if this sort of things happens in DF, though.

One thing I'm not convinced of, however:

Quote

Assuming you have chosen a PRNG with the right qualities, the results will be much closer to random than, say, rolling a set of percentile dice;

It is possible to roll percentile dies 100, 1,000 or 1,000,000 times in a row (albeit increasingly and exponentially unlikely!) without getting a value above 20 (say, for a given desired result). Desperately unlikely, but possible! However, any PRNG can only have a limited length of cycle of output results. At best, one for every possible internal state[1]. And possibly an algorithm could find that starting with a seed of zero gives it one (eventually) repeating output sequence while a seed of one gives it another, with who knows what share of the theoretical total of the outputs.

Anyway! Given this, there is a cycle-length, and (depending on how you interpret the individual values) there is not room for all possible combinations of 'simulated' random results in that sequence. Imagine that you have a guaranteed million 'novel' values before repeating, there's not room in there for the (unlikely, but possible!) half-a-million results of one kind, in a row, plus half-a-million results of another kind, in its own row, plus half-a-million results alternating from one to the other... Never mind all the other (less easily described) 'random' sequences. No matter how you ascribed the mapping from PRNG output to result-type (consistently).

Obviously, that's an out-there example, but it can be worked out the maximum length of results that can (by landing on whichever part of an assumed optimal result-sequence you wish) be obtained in every single possible combination of said length. (Obviously, the smaller sequences can occur more than once.) Although even though the individual results (and even result-pairs and result-triples, etc) might be statistically equally likely, at the upper limit one might find that there is no example of the Rosencrantz And Guildenstern Are Dead coin-toss results... (And, yet, maybe its inverse is there?)

However, I know that to be a picky point. Insofar as gameplay is concerned, then either the sub-sampling of the sequence is short enough that the presence/absence of any particular sequence does not become particularly noteworthy or the not-quite-randomness that remains is hidden behind further algorithms that obfuscates it ('noise-like', and also beyond the unknowing player's immediate perception) so that it never appears as extraordinary. And, in general, a (say!) if there does not exist a seed for a Minecraft world that produces a completely flat desert across one million tiles'-worth of surface, with cactii arrayed across it at distances of exactly ten squares distance for the entire area, then I still can't see a problem. Should it be possible, and someone happens across the said world, then I would more suspect Notch of actually hiding this 'possible' result as an easter-egg (e.g. along the lines of "if seed_string=='HappyEaster' then override(desert,1000x1000,flat) && override(cactiigrowth,separation:10,alldesert)" or however it may be coded) than be happy that such an improbable result has popped up (necessarily at the expense of other more 'noisy' results that would have been possible!).

Still, it's interesting to ponder. For example, shall I ever see DF terrain appear before my eyes with the words "Hello Starver!" spelt out in giant ASCII-art through one terrain feature-type or another (like tree-growth patterns, on a plain), in a given worldgen? Possible... Eminently possible. Given all possible results. Although I suspect something less clear or comprehensive (of the "one monkey on one typewriter for only half of eternity" kind) is all I could actually expect, however much my heart might be set upon it.

(Plus, I realise that my "parallel seed trails" method, as mentioned about fifteen paragraphs up, greatly reduces the potential distinctiveness of sub-elements of such a generated universe, especially given my comments about internal and external resolutions of a PRNG state-machine, but I also get the advantage that if I inspect the totality of the results and spot something dire (e.g. a planet that ends up with all its mountains formed as candy-stripes around it, when that would be patently and obviously unnatural), I can adjust just that aspect of my procedural generation until I'm happy with the new look of that aspect. In an emergency, insert an "anti-Easter Egg" that tells Planet6's geography to be governed by the unused Planet111's geography seed, but keeps all the rest as they were. But what a hack that would be.

)

[1] Not necessarily limited to the magnitude of the output's resolution, and in fact there's good reasons to want to have better than the 'resolution' of the outputted random number, so that you can't in any way anticipate that binary random number 01101010111000101010101011001000 is always followed by binary number 10100011111001101010101001100011... this might well not be true if the internal state is a binary number twice as long, and you get the least significant half of the internal bytes[2] as an output)

[2] Or the most significant half, or every other bit from the sequence, or every bit-pair XORed together... whatever it is your algorithm likes to do to the internal state for output, somewhat regardless of what it plans to do to the internal state to make it into the next internal state.

BinaryBeast1010011010 · « **Reply #27 on:** May 29, 2012, 03:17:57 am »

ptw, this is amazing

I am going to print this thread as a pdf I think ^^

Thief^ · « **Reply #28 on:** May 29, 2012, 03:58:45 am »

Mersenne twister is a particularly good random number generator because it has a very large internal state, so in a long run it can produce the same (output) number quite a lot of times without then being followed by the same sequence. It also has the advantage of being able to produce large random numbers (32 and 64-bit).

In comparison the standard random number generator in most programming languages (particularly C/C++) is the "linear congruential generator", which is very quick and simple but "a bit crap".
e.g.: Microsoft Visual C++'s version can only output 15-bit numbers.
GCC's apparently outputs alternate odd and even numbers.

SmileyMan · « **Reply #29 on:** May 29, 2012, 05:41:36 am »

Implementing a mersenne twister is easy, just use the Boost::Random library

Why write code that someone else has already written better?

News:

Author Topic: Mersenne twister in df (Read 10606 times)

BinaryBeast1010011010

Re: Mersenne twister in df

blue sam3

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

KoboldWarmonger

Re: Mersenne twister in df

Starver

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

KoboldWarmonger

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Shinziril

Re: Mersenne twister in df

blue sam3

Re: Mersenne twister in df

Miuramir

Re: Mersenne twister in df

Starver

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Thief^

Re: Mersenne twister in df

SmileyMan

Re: Mersenne twister in df