Topic: Mersenne twister in df (Read 10600 times)

BinaryBeast1010011010 · « **on:** May 25, 2012, 03:02:58 pm »

I read that df uses a mersenne twister, after reading the wikipedia article I cant help but admit that :

1)I dont understand how it works (please someone explain me)
2) how the heck is it used in dwarf fortress??? I must be broad and dumb but I cant see...

PTTG?? · « **Reply #1 on:** May 25, 2012, 03:19:26 pm »

The twister is use behind the scenes to generate random numbers.

BinaryBeast1010011010 · « **Reply #2 on:** May 25, 2012, 03:21:57 pm »

I understand that but how are the random numbers used?

Uristocrat · « **Reply #3 on:** May 25, 2012, 03:41:11 pm »

Quote from: BinaryBeast1010011010 on May 25, 2012, 03:21:57 pm

I understand that but how are the random numbers used?

They're used to make random decisions. For example, during worldgen, to decide whether or not to make a river, mountain, etc. Or when in combat, to decide if an attack will hit or not. Say that your dwarf should have a 70% chance to hit. You generate a random number between 1 & 100. If you get any number between 1 & 70 from your RNG, you call it a hit. The numbers between 71 and 100 count as misses. So the random numbers help you tell the computer to do things sometimes, but not always.

Think of it as telling your computer to flip a coin to decide what decision to make. The random number is, essentially, the result of your coin flip.

BinaryBeast1010011010 · « **Reply #4 on:** May 25, 2012, 04:26:03 pm »

ok I get it, now, does anyone know how the mersenne twister algorithm works?

Warmist · « **Reply #5 on:** May 25, 2012, 04:50:30 pm »

Yes: you put some numbers in, twist em around and get some number out. Basically it's a LOT of math, pure simple math. Also i don't like it and used multiply with carry when i needed to just because it's a lot easier to implement.

For more exact code see wikipedia (and it's references). But i'm pretty sure nobody needs that, it's enough to know that it makes pseudorandom numbers, has a long(very) period, and good dimensionality, and most importantly passes most randomness tests.

BinaryBeast1010011010 · « **Reply #6 on:** May 25, 2012, 05:20:11 pm »

I understand that it twist them around but :

Spoiler: quoting wall of pixel death (click to show/hide)

first, to xor dont you need the bits to be binary?
a here being the whole number and a(1) the segment used during the iteration?
and what does
a(i) = temp shifted right one bit xor
X'9908B0DF' if temp is odd xor
a(i+397)
mean?

Starver · « **Reply #7 on:** May 25, 2012, 05:38:28 pm »

Eveything in a computer is in bits. That's the nature of computers. (Well, your puny earthly ones, that is, my trinary-base super-space computer on the moon is far more... *cough cough* ...never mind.)

Do you mean, more than a single bit each? If so, the XOR (or other logical comparator) is performed on each equivalent bit in turn, and the result is the result bits brought back.

For example, 42 XOR 69, assuming the values are stored in single bytes, would be 00101010 XORed with 01000101, which would be the first bits XORed together (0 ^ 0 => 0; then 0 ^ 1 => 1, etc) to make 01101111, or 111 in decimal (ouch, should have chosen an example less likely to confuse).

Or is that not what you meant?

Warmist · « **Reply #8 on:** May 25, 2012, 05:43:08 pm »

First i probably know that you don't need to know.
Second this requires programming knowledge, because (almost) only programmers and mathematicians worry about this.

Answers:
xor works with bits that is true, but in programming everything is bits! (e.g. number "3" is "11" in bits, so saying "3 xor 5" is same as saying "11 xor 101")
a is an array of numbers, a sort of a box or a list of numbers, you could look into as a GIANT number (that is 624x32 bits thats one loooong number)
and yes in that sense a(i) is a segment that is used in interation, but it's easier to look as into 624 different numbers
and that last code part means what is written

shift'ing being an operation there bits are moved (left or right) in a number i.e. if we have a 32bit number (like these are) then 1 shifted left is "10" in binary or 2 in dec. And so on.
Seriously why do you need to know, if you want to learn about pseudorandom number generators i'm sure there are more easier ones doing just the same (except some corner cases). Usage in programming is usually "rand()" without any wories how it works, also understanding this will not tell you how Toady One generates terrain, features or history.

Got a bit ninjad

Edit: as for BinaryBeast you sure don't know what is binary, and what operations you can do with it... Trolling maybe?

BinaryBeast1010011010 · « **Reply #9 on:** May 25, 2012, 06:06:12 pm »

I know what is binary thank you, the use of xor for non binary numberssurprised me. I do not NEED to know, I'm just interested in the subject. I dont want to know how toady gen all the things (since it is under the hood and such things are not to be known).

I'm interested in mersenne twister because it seems more usable than blum blum shub. Until a while I thought the only way to generate truely random strings of numbers was using radioactive material, I heard about the middle numbers algorithm but it does not seem "random enough" for things like what df does.

about bitwise command I learned them playing around with darwinbots (I know thats no programing school but that's the best I have)
it's "temp" that I dont understand.

if it is secret knowledge though I wont bother you anymore

nice calling me a troll when I only look for teachings from more competent people.

Starver · « **Reply #10 on:** May 25, 2012, 07:01:29 pm »

Quote from: BinaryBeast1010011010 on May 25, 2012, 06:06:12 pm

I know what is binary thank you, the use of xor for non binary numberssurprised me.

It shouldn't do. As already mentioned, everything is binary in computers...

Which, as BinaryBeast666 by another name, I would have imagined you would have appreciated...

Also, you'll never generate a truly random string of numbers with any algorithm (with no external influence, such as radioactive materials, cosmic background radiation or a camera aimed at a magma lamp). But you can make something that's not trivially predictable, in various ways depending on whether you're looking to use it as a cryptographic function, or generate 'noise' that (in DF's case) is refined down to geographic and geological and historical features.

As to what "temp" means... It's a variable. Just like 'i', although it's been given a name that makes it obvious that it's temporary in some way.

Like a method to swap two numbers might be written as "swap(x,y)" and consist of (in some weird pseudolanguage that's a composite of several I use)

function swap(a,b:numbers inandout) {
temp = a;
a = b;
b = temp;
}

When this function (in whatever form fits the language you're writing in) has finished, the original values of 'x' and 'y' are swapped round, having been transferred to variables 'a' and 'b' within the function and then back again (some form of pass-by-reference, right, but just ignore the man behind the curtain for the sake of veracity in this case), and also we had a variable called 'temp' being used, but only ever needed within that function.

In the case of the 'temp' in the posted code snippets, you have 'temp' accepting a value from (a transformation of) a given value and then (after further transformation) being written back to the original value. Or thereabouts.

In a far simpler version, imagine "i = (i + 1) * 2" as "temp = i + 1" and then "i = temp * 2". But in the Twister variation, there's a conditional in the assignment. Imagine it as "i = (i + (1 if i is odd, else 2 if it's even)) * 2". A bit of a mouthful, so there's a split of "temp = i + ?" into "if temp is odd, add one, else add two" before getting on with the "i = temp * 2" bit.

There are computer languages that have convenient (if sometimes obfuscated) methods of doing that in one statement without any 'temp' variable, for example "$i = ($i+(($i%2):2:1))*2" (although you could also do that simple little trick by something more like "$i=4*int((i+1)/2)", although I might have messed up the mental arithmetic or some basic syntax in these examples).

To be honest, that code-snippet you give is strangely formatted. I can read it, but it's perhaps not so intuitive to a non-coder. Though it's very COBOL-like, in many respects.

Anyway, I hope that this helps, although you may want to start with something a bit simpler. I'm currently struggling with how to deal with an algorithm (I'm mapping a 3-sphere's surface to a more standard euclidean 3D volume using a kind of gnomonic projection, only with the extra dimension added to both source and destination), and tripping up over how to extend the left-handed/right-handed rules to more fingers. But at least I'm fairly confident with the programming language that I'm using, and it's just the maths that I'm not happy that I'm getting quite right.

BinaryBeast1010011010 · « **Reply #11 on:** May 26, 2012, 05:15:21 am »

You did help me indeed thank you. I do apologise if my first sentence seemed harsh.
The cryptographic use random numbers in one time pads always fascinated me, the problem is how to obtain them. How old ladies thinking bingo numbers was not random enough for cryptographic use is one example. Blum blum shub uses big primes but I think we can predict that in some years (if I am optimistic, by the end of this century or the next) we well be able to factorize their product so it will be useless. The future of cryptography (it's only my opinion) lies in one time pads (the quantic computer does, in theory at least, have a mean to securely transmit the keys). As one time pad is a perfectly secure encryption (provided the key is truely unpredictable) I hoped there also was a way to generate keys for them without using expensive radioactive thingies (numerized lava lamps are cool but if they could be replaced by an human written program wich only needs a truely random seed to create an infinite amount of random numbers it would be even better).

so once the first bit of a(1) followed by the last 31 bit has been shifted right and xored with 9908B0D, (if it is odd it is xored again with the remaining bits of the 623 bit long string)
we keep the result as first part of the random number string. if temp is even do the other bits just stay the way they are?

Warmist · « **Reply #12 on:** May 26, 2012, 06:19:39 am »

actually if you are using linux, there is special device called /dev/random that emits TRUE random numbers. It works by collecting various errors and timeouts and etc and then hashing them with some function. It's not as fast as pseudo random number generator /dev/urandom but secure.

BinaryBeast1010011010 · « **Reply #13 on:** May 26, 2012, 06:41:43 am »

As I understand it it does not start from a seed state so it isnt usable as a key generator for two communicating computers.
Is it the one used in backtrack with the "macchanger -r" function?

Starver · « **Reply #14 on:** May 26, 2012, 12:43:09 pm »

Quote from: BinaryBeast1010011010 on May 26, 2012, 05:15:21 am

The future of cryptography (it's only my opinion) lies in one time pads (the quantic computer does, in theory at least, have a mean to securely transmit the keys). As one time pad is a perfectly secure encryption (provided the key is truely unpredictable) I hoped there also was a way to generate keys for them [...]"

There's all kinds of ways to get a one-time pad. You can have someone flip a coin or roll a dice, really. (Indeed, you could even design a machine to flip the coin/roll the dice and record the results, as long as you don't make it so perfect that it introduces its own predictable patterns[1]!) But if you're generating actual one-time pads with an algorithm, you're doing it wrong.

The insecurity of one-time-pads is in the transmission/couriering of the information. If Eve gets a copy of the OTP, then she can listen to Alice and Bob freely. But a OTP with sufficient length to be useful for extensive communication could, if captured, be analysed and may have its underlying generating algorithm exposed... Thus (even if some key input elements to the algorithm in the next generation of the OTP) the pad that Alice and Bob share need not even be intercepted by Eve, because she's got a copy (or several possible copies), that can be used without having made any further attempts to temporarily 'borrow' the pad being used. This would never (or, as stated elsewhere, practically) be impossible using a true random source for each subsequent generation of pad.

Cryptographically, just use the algorithms straight. Let Alice encode using the algorithm and whatever key-settings are being used, and let Bob decode using the algorithm and whatever key-settings are being used. Best practice is that although you don't shout the algorithm to the world (when not making something like PGP which is intended to be used by anyone), you still start off with the assumption that the enemy knows the method of encoding. The key/parameters used to initialise/control the algorithm are the main secret. Also, it's easier for Alice to let Bob know that she's no longer using "123456" as the main key to the algorithm, and instead using "654321", if she suspects that Eve has gamed knowledge of that first key, rather than having to give Bob a new decoding process to adopt.

Also, all PRNGs start from some seed state, even if it's in-built and unalterable. Functionally, there's quite a lot of equivalence between a key for a cryptographic algorithm (which may even be the first few characters of the plaintext itself, although that seriously falls foul of the "assume they know the algorithm!" rule) and the seed for a PRNG that you're using to generate OTP-like sequences at each end (independently creating the same OTP based upon the same seed... that you can only hope is secure and not known by Eve Strupper herself).

The /dev/random solution, by the way, is generally very good, but eventually you may run out of 'entropy', in systems that do not have sufficient activity from which to derive the non-computed items of randomness used to fuel the /dev/random process. The /dev/urandom alternative actually (although it may depend on your system) allows that a (complex, and usually unpredictable, as long as you're not hammering it) re-use of the initial randomising sources can be used for far longer. So instead of saying "I can't help you", it presents previously-given random data in a new format. Like I said, it's not perfect (require it to be so for 1000 consecutive cycles, and a villain with enough patience or ability to make requests of her own might be able to work out what you're going to get next time, or at least the time after she has sought her own output set!), but it might help pad thngs out.

Alternatively, it's not unknown to take a given (small, but significant) amount of true-random data and use it to seed a PRNG function (in a process not that different from urandom operation) that can return (say) a length of data that is double or quadruple the original data in size but is highly-non-trivially predictable in itself. Although as your maths would clearly show, feeding a single byte into such an algorithm would never produce more than 256 different 4-byte 'hashes' in the final output, if that is how it set up. But each result may be intrinsically unpredictable in nature, in and of itself (at least until the 'byte-expander' algorithm is understood, in which case what might have been a 4-byte brute-forcing process becomes a 1-byte brute-forcing one.)

I have no direct experience of macchanger, BTW, but given the entropy contained in a MAC address, I'm not sure that a bit of randomness-extraction from a pure /dev/random-like source would cause concern, as long as you aren't tying to enact thousands of examples at once, or something. But from what I know of the uses of that tool, I can't see that being an issue. And any non-trivial algorithm will probably do you well enough.

Forgive me if I say so, but I've a feeling that you've launched yourself into a technical environment with a set of handy scripts and such that you're learning to use, and are yet attempting to be taken seriously as the kind of person who professionally works at codebreaking/codemaking. Hell, I'm paraphrasing half the things I'm saying, and yet do not pretend to understand all the subtleties of comparative effectiveness between (say) one hashing method and another. And someone's discovered that for certain inputs to a given encryption algorithm the solving difficulty has been reduced from O(!N) to O(N²)? Well, bully for them, but I couldn't necessarily follow their workings out...

We've also strayed well off the beaten track insofar as Dwarf Fortress discussion is concerned. As far as that is concerned, a random geographic seed (or the one for history-calculation, or for the naming of entities, or creature creation) is a number that has, say, somewhere in the magnitude of 2^32 different possibilities (four bytes'-worth, although in reality some of those numbers may well produce essentially identical worlds due to entropy lost in rejects, or specifically-calculated gem-veins being eroded out of existence as if they'd never been there in the first place), which means that you've got a lot of different worlds to play with. And who cares if you can look at a world and derive the seed without taking the specifically exported information about that, into account? Because there's no secrecy about the end-result, and if there's anything propriety about the algorithm in use it's more akin to an artistic technique. Nothing can really stop a forger working out how to plant the paint just so on their own copy (or upon a 'new, undiscovered piece' of their own fabrication that they wish to get officially attributed) to look genuine, but for most purposes using ones own artistic prowess in making a piece of art that is your own is going to be far more rewarding in the long-run, given the chance to make your own name. (Time-scales may be different between those who do Old-Master-pretence and those that would work in the software-equivalents, and when compared to life-spans and the relative speed and magnitude of rewards I can't see the latter finding it worthwhile to work this form of duplicity, even while the former still apparently is a viable earner...)

But I don't think you're even wanting to talk about that, and so maybe I waffle.

[1] If you consider the universe to be deterministic, all you are doing is using an incredibly complex and hard-to-analyse 'algorithm' to generate the pad. Should the pads owned by Alice and Bob have been derived from something "universally random" and yet Eve has access to some sufficiently complex subset of the Theory Of Everything that exactly predicts how the coin-flips have happened while generating that pad, then it's not secure. Of course, that 'algorithm' has so much hidden information that might be vital to knowing the next coin-flip's end-state (e.g. whereabouts on the the table the coin landed, so that the hand reaching for it grasps it in a particular way just prior to setting it up for the following flip, and so many other things) that the cracking-algorithm that had been shown to predict the first 100 flips (or, by elimination is the cracking-algorithm (or one of the few!) that has been successful in all prior (testable!) cases, is as likely to be wrong as right upon the 101st flip. Unless you're running an identical universe! Anyway, I diverge from my originally intended message.

News:

Author Topic: Mersenne twister in df (Read 10600 times)

BinaryBeast1010011010

Mersenne twister in df

PTTG??

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Uristocrat

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Warmist

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Starver

Re: Mersenne twister in df

Warmist

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Starver

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Warmist

Re: Mersenne twister in df

BinaryBeast1010011010

Re: Mersenne twister in df

Starver

Re: Mersenne twister in df